The effects of mobile apps on shopper purchases and product returns

Electronic copy available at: https://ssrn.com/abstract=2878903

THE EFFECTS OF MOBILE APPS ON SHOPPER PURCHASES AND PRODUCT RETURNS

Unnati Narang and Venkatesh Shankar*

December 1, 2016

* Unnati Narang ([email protected]) is a PhD student in Business Administration-Marketing at Mays Business School, Texas A&M University. Venkatesh Shankar ([email protected]) is Professor of Marketing and Coleman Chair in Marketing and Director of Research, Center for Retailing Studies, Mays Business School, Texas A&M University. We thank participants at the 2016 Theory and Practice in Marketing (TPM) Conference and the 2016 Professors’ Institute meeting of Marketing EDGE for valuable comments.



Do mobile apps influence shopper purchases and product returns? We model the effects of app adoption in the context of a large omnichannel retailer with 32 million shoppers. We leverage the launch of a mobile app by the retailer and use a difference-in-differences approach to identify and estimate the differences between app adopters and non-adopters in shopping outcomes, such as the incidence and monetary value of purchases and product returns. We find that app adopters buy 21% more often but spend 12% less per purchase occasion and return 73% more often than non-adopters in the month after adoption. Overall, app adoption results in a 24% increase in net monetary value of purchases. Our findings are robust to alternative explanations and measures. Furthermore, our analysis of the drivers of app use reveals that exposure to offers and rewards through the app plays a key role in driving shopping outcomes. Surprisingly, the number of unique app features accessed by the shopper has an inverted U-shaped relationship with shopping outcomes, suggesting managerial caution against “all-in-one” app designs.

Keywords: difference-in-differences, exponential Type II Tobit, mobile marketing, mobile apps, quasi-experiments



1. Introduction

In recent years, the penetration of mobile devices has reached unprecedented levels. By January

2016, 79.1% of the United States’ (U.S.) population or 198.5 million people owned a smartphone

(comScore 2016a). By the end of 2016, over two billion people worldwide will be smartphone

users (eMarketer 2014) and by 2020, more than 70% of the world population will own a

smartphone (Ericsson Mobility Report 2016).

Mobile devices play a unique role in influencing shoppers along and beyond their paths to

purchase. Mobile devices are interactive, engaging, portable, wireless, location-specific, and

personable. As a result, they are uniquely positioned to influence shoppers at various stages of

the shopping process – need recognition, information search, alternative evaluation, purchase,

and post purchase (Shankar and Balasubramanian 2009).

Little wonder that mobile marketing is becoming a strategic priority for firms. U.S. firms spent

over $28 billion on mobile advertising in 2015 and are projected to double this level by 2018

(eMarketer 2016). Chief Marketing Officers (CMOs) of leading firms already allocate up to 20%

of their budget to mobile (Forrester 2016). Terry Lundgren, Macy’s CEO, views mobile as the

starting point for shopping when he says, “shoppers are starting the journey with their phone,

doing their research. Then they might buy in the store or they’ll buy at Macys.com or

Bloomingdales.com” (Peterson 2015). Mobile devices have changed the way people shop, giving

rise to an emerging area of mobile shopper marketing and revolutionizing retail (Shankar et al.

2016). More than 80% of U.S. shoppers use a mobile device to shop even within a store (Google

M/A/R/C Study 2013). Nearly 70% of Amazon’s customers used mobile to shop in the 2015

holiday season (Eadicicco 2015).


Mobile apps are increasingly dominating mobile device use. Mobile apps account for 87% of

mobile usage, which constitutes the bulk of digital media time (comScore 2016b). By 2015, there

were over 250 billion app downloads from the App Store and Google Play (Sims 2015). About

20% of all Starbucks transactions originated from its “order and pay” app (Forbes 2015).

Do mobile apps influence shopper behavior? Mobile apps offer informational (e.g., product

and store information) and experiential (e.g., offer, loyalty program reward redemption) benefits.

These benefits may lead shoppers to purchase more often and spend more money. However,

while a mobile app can induce purchases through the app or mobile web, does it increase overall

purchases across all the channels, including brick-and-mortar and online? Furthermore, a mobile

app can prompt a shopper to act and make a purchase, but such an action can also result in post-

purchase regret, leading to higher product returns. Therefore, the net effect of mobile apps on the

monetary value of purchases is unclear.

Furthermore, app features, such as product search, store check-in, loyalty program, and

promotional offers, may have specific effects on shopper purchases and returns and help explain

the effects of a mobile app on shopping outcomes. The use of a greater number of app features

may lead to more purchases and even product returns.

Despite the widespread use and the potentially significant effects of mobile apps and their

features, the consequences of mobile apps and the effects of app features have been

underexplored. A few studies (e.g., Kim et al. 2015; Gill et al. 2016) have considered the effects

of mobile apps on loyalty points accrued, purchase intent, website visits, or aggregate purchase

amounts. We extend prior research by isolating the effects of mobile app adoption on a richer,

managerially important set of outcomes, such as individual purchase incidence and amount and


return incidence and amount. Importantly, we explain these effects through app feature-related

mechanisms. Specifically, we address three research questions:

Does mobile app adoption lead to higher or lower incidence and monetary value of purchases and returns?

What are the sizes of differences in purchases and returns between app adopters and non-adopters?

What are the effects of the number and type of app features on shopping outcomes and how do they help explain the overall effects of a mobile app on shopping outcomes?

We address our research questions using a unique dataset from a large omnichannel retailer

of video games, consumer electronics and wireless services. Teasing out the effects of mobile

apps on purchase outcomes is complex. A major challenge is endogeneity and self-selection

potentially confounding the effects of variables affecting both app adoption and shopping

outcomes. We tackle this challenge in two ways by: (a) adopting a combination of difference-in-

differences method with propensity score matching and Heckman selection correction, and (b)

carrying out a series of robustness tests to rule out alternative explanations.

Our results show that app adopters buy 21% more often but spend 12% less per purchase

occasion and return 73% more often than non-adopters in the month after adoption. Overall, app

adoption results in a 24% increase in net monetary value of purchases. Surprisingly, the number

of unique app features accessed by the shopper has an inverted U-shaped relationship with

shopping outcomes, suggesting managerial caution against “all-in-one” app designs.

Our research contributes to the mobile marketing and omnichannel marketing literatures in at

least three ways. First, we expand the scope of mobile marketing literature by examining the

impact of mobile apps on both purchases and returns across channels. To our knowledge, no

other mobile marketing study has examined the net monetary value by accounting for returns.

Second, unlike most prior studies that focus on associations between mobile interventions (e.g.,

coupons) and purchase outcomes, we isolate the effect of mobile app adoption on purchases and


returns. Finally, we examine the effects of a comprehensive set of mobile app features that

enable managers to improve resource allocation to the conceptualization, development, launch,

and maintenance of mobile apps.

2. Related Literature and Framework for Empirical Analysis

Mobile devices influence shoppers both in- and out-of-store by offering convenient and

interactive anytime-anywhere access to relevant information (Shankar et al. 2016). Mobile apps

may affect purchases in two major ways. First, mobile apps can provide information benefits at

the right time and place to shoppers. Such benefits include product information, product reviews,

and store location information (Danaher et al. 2015; Dubé et al. 2015; Fong et al. 2015). Second,

mobile apps can offer experiential/interactive benefits through loyalty program use, notification,

offers, and store check-ins (Shankar and Balasubramanian 2009). For example, Starbucks offers

consumers location-based promotions via its mobile app for declaring their loyalty on social

networks and status badges for store check-ins and a mobile pay option (Andrews et al. 2016a).

Two streams of research are relevant to our research questions. First, the literature on mobile

apps has focused on the effects of apps on a few outcomes. Mobile app use improves brand

attitude and purchase intention (Bellman et al. 2011). A brand’s mobile app also promotes visits

to its website (Xu et al. 2014), enhances loyalty points accrued (Kim et al. 2015), and can

influence purchase probability (Dinner et al. 2015). Furthermore, the use of a mobile app can

increase shoppers’ spending in the business-to-consumer (B2C) (Einav et al. 2014) as well as the

business-to-business (B2B) (Gill et al. 2016) context. Informational apps are more effective in

driving purchase intention than experiential apps (Bellman et al. 2011), and app features, such as

information lookup, significantly affect loyalty points accrual (Kim et al. 2015).


Second, studies on mobile device adoption have concentrated on its effects on purchase

incidence or monetary value. Wang et al. (2015) examine changes in shopper spending after

mobile device adoption and find that purchase incidence and monetary value of purchases

increase. In contrast, Lee et al. (2016) find that the monetary value of each purchase is lower for

shoppers who transact more using mobile than web. Xu et al. (2016) study the effect of tablet

adoption on commerce through smartphones and computers in the online retail context and

conclude that commerce through tablets substitutes desktop commerce but complements

smartphone commerce.

Our study complements and extends these research streams as shown in Table 1. We study

the effects of app adoption on a variety of shopping outcomes: purchase incidence, purchase

amount, return incidence, and return amount. In addition, we examine the effects of a

comprehensive set of app features on shopping outcomes.

< Table 1 about here >

Based on the evidence from these related research streams, we develop a conceptual

framework delineating the drivers of shopper decisions and outcomes at various shopping stages.

It appears in Figure 1. In this framework, some existing shoppers of the firm adopt the app, while

others do not. The app adoption decision depends on shopper demographics, past shopping

behavior, and the data connectivity environment. Both adopters and non-adopters of the mobile

app make purchases. In addition to app adoption, recency of last purchase, income and customer

tenure drive the incidence and monetary value of purchases. Some shoppers return some of the

purchases made. The recency of last purchase, distance to the nearest store, number of stores in

the shopper’s zipcode, and order size determine the incidence and monetary value of returns.

< Figure 1 about here >


3. Data and Research Setting

We collect data from a large US-based retailer of video games, consumer electronics and

wireless services. Our data span July 2014-June 2015. In addition to transactions-related data

from the retailer’s 4,175 stores across the U.S. and ecommerce website, we have access to data

on mobile app usage of over 32 million customers and members of the retailer’s loyalty program.

The loyalty program contributes to nearly 75 percent of total transactions, reflecting the retailer’s

overall customer base. The retailer’s primary channel is its store network; only a small

proportion of sales transactions take place through its ecommerce site. We have data on

shoppers’ transactions. From these data, we identify the relevant outcomes to map shopper

behavior. Purchase and return incidence (whether a shopper makes a purchase or return) and the

monetary value of purchases and returns are our key outcome variables. The key variables, their

operationalization and descriptive statistics appear in Table 2. We supplement these data with

publicly available data on region-specific data connectivity information (e.g., number of wireless

providers, data speeds) from the US Federal Communications Commission (FCC).


Our focal independent variable is app adoption. The retailer launched its app in July 2014

without any targeted campaign. A subset of shoppers adopted the app over time. The purpose of

the app is to allow shoppers to browse the retailer’s catalog of products, get exposure to deals

and offers, order online, or locate store information to buy offline. The app allows shoppers to

learn about the retailer’s stores, including nearby locations, opening hours, phone numbers, and

driving directions. The app does not offer in-app purchases. Web Appendix A provides

screenshots from the app.


The mobile app data are organized at the shopper/app session level. A new session is

recorded when a user first starts the app or loads the app after not loading it in the previous 15

minutes. For each shopper/app session, the data contain a random session ID and the app features

accessed by the shopper. App features capture in-app activities during the session (e.g., browsing

product catalog, clicking offers and checking reward points).
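As an illustration, the session rule described above can be sketched in a few lines of Python. This is a minimal sketch and not the retailer's implementation: the function name `count_sessions`, the input format (a list of app-load timestamps for one shopper), and the treatment of a gap of exactly 15 minutes as the same session are all assumptions.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=15)  # inactivity threshold stated in the text


def count_sessions(load_times):
    """Count app sessions from one shopper's app-load timestamps."""
    sessions = 0
    previous = None
    for ts in sorted(load_times):
        # A new session starts on the first load, or when the app has not
        # been loaded during the previous 15 minutes (strict gap: assumption).
        if previous is None or ts - previous > SESSION_GAP:
            sessions += 1
        previous = ts
    return sessions


# Hypothetical load times for a single shopper.
loads = [
    datetime(2014, 12, 1, 9, 0),
    datetime(2014, 12, 1, 9, 10),  # 10 minutes later: same session
    datetime(2014, 12, 1, 9, 40),  # 30 minutes later: new session
    datetime(2014, 12, 2, 18, 5),  # next day: new session
]
print(count_sessions(loads))
```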

For the analysis, we divide our data into two periods: a calibration period and an estimation

period. We use the three-month period (August-October 2014) as the calibration period. The

rationale for setting aside this period is to compute shoppers’ past behavioral measures, such as

past spending to help identify similar shoppers for a valid comparison of app adopters and non-

adopters. We treat the months starting November 2014 as our estimation period, identifying and

using shoppers’ app adoption timing as cut-off points for estimating the effects of adoption.

4. Analyses

4.1. Relationship between App Adoption and Shopping Outcomes: Descriptive Analysis

We define app adopters as shoppers who started using the app for the first time during our data

period. Non-adopters are those who did not access the app even once during the study period.

Unlike most prior studies (e.g., Kim et al. 2015; Gill et al. 2016), our data allow us to uniquely

identify each shopper’s app adoption date. We draw random samples of app adopters from

different periods and compare their pre- and post- app adoption outcomes relative to app non-adopters.

Our main analysis reports results for app adopters from December 2014.¹

We draw a random sample of adopters and non-adopters with complete demographic

information and who have made at least one purchase in the calibration period (Xu et al. 2016).

Table 3 reports the mean statistics for 1,629 random app adopters who started using the app on

December 1, 2014 and for 7,956 non-adopters. A simple comparison of shopping outcomes

1 We subsequently performed robustness checks using other samples (see Web Appendix B for details).


shows that the average monetary value of purchases increased 43.48% ($126.25 to $181.15 per

month) for app adopters, while it increased only by 25.93% ($46.89 to $59.05 per month) for

non-adopters one month before and after adoption (p < 0.001). However, the average monetary

value of returns for app adopters also increased by 96.09% ($9.46 to $18.55 per month)

compared to app non-adopters who experienced a marginal increase of 0.75% ($4.02 to $4.05

per month) in the same period (p < 0.001). Overall, the net monetary value increased by

39.22% for app adopters compared with 28.30% for non-adopters. It is notable that the

number of purchase transactions for the app adopters increased by over 60% relative to non-

adopters who experienced only a 16% increase. As a result, the monetary value per incidence

declined by 10.33% ($85.66 to $76.81 per month) for app adopters, while it increased for non-

adopters by 8.63% ($66.28 to $72 per month). Histograms depicting these model-free data

appear in Figure 2.
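The percentage changes in this model-free comparison can be recomputed from the monthly means quoted above. The Python sketch below does so and also reports a raw difference-in-differences in net dollars. It uses only the rounded means given in the text, so the recomputed percentages may differ from the reported ones in the second decimal. The raw difference is purely descriptive (unmatched, no covariates) and is not one of our model estimates. The names `pct_change` and `net_value` are illustrative.

```python
def pct_change(before, after):
    """Percentage change from the pre-adoption to the post-adoption mean."""
    return 100.0 * (after - before) / before


# Monthly means reported in the text: (pre, post), in dollars.
purchases = {"adopters": (126.25, 181.15), "non_adopters": (46.89, 59.05)}
returns = {"adopters": (9.46, 18.55), "non_adopters": (4.02, 4.05)}


def net_value(group):
    """Net monetary value (purchases minus returns) before and after."""
    (p_pre, p_post), (r_pre, r_post) = purchases[group], returns[group]
    return p_pre - r_pre, p_post - r_post


for group in purchases:
    print(
        group,
        round(pct_change(*purchases[group]), 2),  # change in purchases (%)
        round(pct_change(*returns[group]), 2),    # change in returns (%)
        round(pct_change(*net_value(group)), 2),  # change in net value (%)
    )

# Raw difference-in-differences in net dollars per month (descriptive only).
a_pre, a_post = net_value("adopters")
n_pre, n_post = net_value("non_adopters")
raw_did = (a_post - a_pre) - (n_post - n_pre)
print(round(raw_did, 2))
```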

< Table 3 and Figure 2 about here >

4.2. Econometric Model: A Quasi-experimental Approach

To estimate the effect of mobile app adoption, the ideal approach would be to compare the

shopping outcomes when shoppers adopt the mobile app to the counterfactual, that is, to

outcomes when the same shoppers do not adopt the mobile app. However, because we do not

observe the counterfactual (a shopper cannot be both an app adopter and a non-adopter) and

because the treatment is not randomly assigned (a shopper self-selects into adopting the app), we

develop a quasi-experimental design to replicate the ideal experimental scenario under

reasonable assumptions (Campbell and Stanley 1963). Specifically, we employ a difference-in-

differences (DIFF-IN-DIFF) approach to compare pre- and post- adoption outcomes for app

adopters and similar non-adopters. After specifying the baseline difference-in-differences


regression model, in the next section, we outline our strategy to address the endogeneity of

treatment using Propensity Score Matching (Rosenbaum and Rubin 1983) and Heckman

correction procedures (Heckman 1979). We rule out several competing explanations for our

results in the robustness checks. The complete list of analyses is laid out in Table 4.


Baseline Difference-in-Differences Model

We adopt a difference-in-differences approach to compare the change in outcomes for the

app adopters one month before and one month after app adoption to the change in outcomes for

the non-adopters over the same time period.² Formally, our baseline difference-in-differences

model can be specified as a two-period linear regression model:

(1) 𝑌𝑖𝑡 = 𝛼0 + 𝛼1𝐴𝑖 + 𝛼2𝑃𝑡 + 𝛼3𝐴𝑖𝑃𝑡 + 𝜗𝑖𝑡

where i is individual, t is month, Y is the outcome variable (number and monetary value of

purchases and returns), A is a dummy variable denoting treatment (1 if shopper i is an app

adopter and 0 otherwise), P is a dummy variable denoting the period (1 for the period after the

app has been downloaded and 0 otherwise), α is a coefficient vector, and ϑ is an error term. The

coefficient of AiPt (TREAT * POST) identifies the treatment effect.
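To make the interpretation of the interaction coefficient concrete, the sketch below computes the coefficients of Equation (1) on simulated data. Because the model contains only two dummies and their interaction, it is saturated, so the OLS coefficients coincide with differences of the four cell means. The data are synthetic (a simulated treatment effect of 5), and the function name `did_coefficients` is hypothetical. Nothing here reproduces our estimates.

```python
import random
from statistics import mean


def did_coefficients(rows):
    """rows: iterable of (A, P, Y). Returns (a0, a1, a2, a3) of Equation (1).

    The two-dummy model with interaction is saturated, so its OLS
    coefficients can be read directly off the four cell means.
    """
    cells = {(a, p): [] for a in (0, 1) for p in (0, 1)}
    for a, p, y in rows:
        cells[(a, p)].append(y)
    m = {key: mean(values) for key, values in cells.items()}
    a0 = m[(0, 0)]                     # non-adopters, pre-period
    a1 = m[(1, 0)] - m[(0, 0)]         # pre-period gap between groups
    a2 = m[(0, 1)] - m[(0, 0)]         # common time change
    a3 = (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])  # treatment effect
    return a0, a1, a2, a3


# Synthetic data (hypothetical, not the retailer's): true effect = 5.
random.seed(0)
rows = []
for i in range(2000):
    a = i % 2  # alternate adopters and non-adopters
    for p in (0, 1):
        y = 50 + 10 * a + 3 * p + 5 * a * p + random.gauss(0, 1)
        rows.append((a, p, y))

a0, a1, a2, a3 = did_coefficients(rows)
print(round(a3, 2))  # should be close to the simulated effect of 5
```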

The underlying identification strategy in the DIFF-IN-DIFF approach is that the change in

outcomes observed in the non-adopter group offers a good counterfactual for the change in

outcomes that would have been observed in the adopter group in the absence of app adoption.

The validity of this assumption relies on the (a) similarity between the app adopters and non-

adopters along their observed and unobserved characteristics (including, for example, common

trends in outcomes in the pre-treatment periods), and (b) absence of any idiosyncratic shock to

2 We subsequently also examined alternative time periods in our (a) comparisons of 15-, 45- and 60-day pre- and post- outcomes (Web Appendix B), and (b) robustness check of app use vs. non-use for extended periods (Table 8).


either group in the study period (e.g., no unique marketing promotions should have been sent to

one group and not to the other). In our case, assumption (b) holds since there were no unique

shocks to either group in the data period. In the absence of natural randomization, we ensure the

validity of assumption (a) by employing matching estimates and the Heckman two-step

correction process, carefully observing the balance between two groups through several checks.

4.3. Endogeneity and Self-selection

In the absence of randomization in an observational study, endogeneity of treatment becomes a

major challenge in estimating the causal effects. Two sources of endogeneity exist in our setting.

First, omitted variables can affect both app adoption and shopping behavior. Consider customers

who are gaming and technology enthusiasts. It is possible that they are more interested in the

video game product category and therefore purchase gaming-related products. They may also be

spending more time on their mobile devices, including exploring the available apps. As a result,

their likelihood of adopting the app is higher. Similarly, it is possible that they read app reviews

and technology columns in the media and become more aware of app functionalities. In such

cases, both game purchases and app adoption are likely, but the effect may not necessarily be

causal. Second, mobile app usage and purchase transaction may occur together. Imagine a

customer using a mobile device while purchasing at a store or at a website. Due to the

simultaneous occurrence, it is difficult to tease out causality. This issue can result in endogeneity

from simultaneity (Wooldridge 2002).

Our quasi-experimental research design combined with a series of robustness and

falsification checks allows us to address the endogeneity concern.

4.3.1. Selection on Observables: Matching Estimates


Propensity score matching allows us to match app adopters and non-adopters on observed

demographic and behavioral covariates, while tackling the curse of dimensionality. Underlying

propensity score matching is the idea of conceptualizing “the observational data set as having

risen from a complex randomized experiment, where the rules used to assign the treatment

condition have been lost and must be reconstructed” (Rubin 2008; Guo and Fraser 2014).

We begin by calculating each shopper’s propensity score, which is defined as the shopper’s

probability of adopting the app. We do this using a binomial logit model.³ Next, we identify

non-adopters similar to adopters based on the estimated propensity scores to create a control group.

This approach is in line with Rosenbaum and Rubin’s approach to create a control group “that is

similar to a treated group with respect to the distribution of observed covariates” (Rosenbaum

and Rubin 1983). We match each app adopter to a non-adopter based on the 1:1 nearest neighbor

matching without replacement.⁴ Formally, if P(Xi) is individual i’s propensity score, the treated

individual i is matched to the control individual j that minimizes ||P(Xi) – P(Xj)|| to create

matched pairs closest to each other (Wangenheim and Bayón 2007; Huang et al. 2012).
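The matching step can be illustrated with a short Python sketch of greedy 1:1 nearest-neighbor matching without replacement. The propensity scores below are made up, the function name `match_nearest` is hypothetical, and the order in which adopters are processed is an assumption, since the text does not specify it.

```python
def match_nearest(treated, controls):
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    treated, controls: dicts mapping shopper id -> estimated propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)
    pairs = []
    # Adopters are processed in the order given (an assumption).
    for t_id, t_score in treated.items():
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        pairs.append((t_id, c_id))
        del available[c_id]  # without replacement: each control is used once
    return pairs


# Hypothetical propensity scores for illustration only.
treated = {"t1": 0.62, "t2": 0.35, "t3": 0.60}
controls = {"c1": 0.10, "c2": 0.33, "c3": 0.58, "c4": 0.64, "c5": 0.90}
print(match_nearest(treated, controls))
```

In this example, t1 takes c4 first, so t3 is matched to its next closest available control, c3. This is the practical consequence of matching without replacement.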

What factors explain the decision to adopt the app? Consistent with extant literature (Hung et

al. 2003; Kim et al. 2015), we model app adoption as dependent on individual shopper

demographics (e.g., age, gender), behavioral measures (e.g., past spend, past returns, past online

buying) and other related measures (e.g., distance to the nearest store, number of stores in the

shopper’s zip code, presence of competitor stores, loyalty program membership level on the

adoption day) that are likely to influence shoppers.

(2) $\text{Propensity/Probability of App Adoption } A_i = \dfrac{\exp(U_i)}{1 + \exp(U_i)}$

(3) $U_i = \gamma + \delta D_i + \varepsilon_i$

where i is customer, U is the utility from app adoption, D is a vector of covariates, (γ, δ) is a

3 We also estimated propensity scores using a probit model and found no significant difference in our results.

4 We present alternative matching estimates, including caliper and Mahalanobis metric, in Web Appendix C.


coefficient vector, and ε is an error term, distributed as double exponential. We also include

squared terms of the covariates to allow for nonlinear relationships and for improved model fit

(Huang et al. 2012).

We conduct a series of statistical analyses to test the goodness of our propensity score

matches, including the Kolmogorov-Smirnov test, Standardized Bias Reduction and

Rosenbaum’s Hidden Bias Sensitivity test (Rosenbaum 2005). The tests show that the match

balance between adopters and non-adopters improved significantly after matching and that there

was no concern for sensitivity of outcomes to hidden bias. The detailed results of these checks as

well as alternative matching methods are reported in Web Appendix C.
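One of the balance diagnostics named above, the standardized bias, can be sketched as follows. We assume the common definition: the difference in group means divided by the square root of the average of the two group variances, expressed in percent. The exact variant used in Web Appendix C may differ. The covariate values are hypothetical, and `standardized_bias` and `bias_reduction` are illustrative names.

```python
from math import sqrt
from statistics import mean, variance


def standardized_bias(treated, control):
    """Standardized difference in means, in percent (assumed definition)."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return 100.0 * (mean(treated) - mean(control)) / pooled_sd


def bias_reduction(before, after):
    """Percentage reduction in absolute standardized bias after matching."""
    return 100.0 * (1 - abs(after) / abs(before))


# Hypothetical past-spend values (dollars) for illustration only.
adopters = [120.0, 95.0, 140.0, 110.0, 130.0]
all_non_adopters = [40.0, 55.0, 60.0, 35.0, 125.0,
                    100.0, 45.0, 135.0, 105.0, 128.0]
matched_non_adopters = [125.0, 100.0, 135.0, 105.0, 128.0]

before = standardized_bias(adopters, all_non_adopters)
after = standardized_bias(adopters, matched_non_adopters)
print(round(before, 1), round(after, 1), round(bias_reduction(before, after), 1))
```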

4.3.2. Selection on Unobservables: Heckman Correction

To more formally account for the non-randomness of the app adoption due to unobserved

factors, we use a two-stage Heckman correction procedure (Heckman 1979). In general, the rich

set of demographic and behavioral covariates used for matching and the common trends between

treated and control outcomes in the pre-treatment period should offer convincing evidence that

the groups are comparable. However, we further test for any unobserved confounders through

the Heckman procedure (Gill et al. 2016). In the first stage, we model the choice to adopt the app

using a probit model. For identification, we require an exclusion restriction that affects the

decision of the shopper to adopt the app without affecting the shopping outcomes. We identify

three such exclusion restrictions that relate to data connectivity and technology environment:

1. Local wireless network access, operationalized as the proportion of the population in the

shoppers’ counties with access to four or more wireless providers, will likely affect the

shoppers’ mobile usage patterns and hence, shoppers’ probability of downloading apps. If

there is greater access to wireless networks, shoppers are likely to engage in more mobile


use and more app download activity regardless of their intrinsic preference for a specific

firm. At the same time, wireless network access is not likely to affect purchases from any

one retailer, in particular, in stores, which serve as the primary channel for the retailer in

this setting. This measure serves as a proxy for unobserved endogenous firm preference

by making app download a function of exogenous network access.

2. Symmetric upload and download speeds, operationalized as the percentage of population

in the shoppers’ state with symmetric Digital Subscriber Lines (DSL) (same download

and upload speed) relative to asymmetric DSL (higher download speeds than upload),

may lead to low adoption of apps in those regions due to slower downloads, without

affecting purchases. This measure serves as a proxy for unobserved endogenous firm

preference by making app downloading a function of exogenous network speeds.

3. Online purchases in the calibration period, operationalized as whether the shoppers used

the online channel to make at least one purchase or not, are likely to affect the shoppers’

perceived value of the mobile app but may not influence how they buy across channels. If

an online purchase was made, on the one hand, it may be valuable for online buyers to

adopt the app to augment their experience. However, since the app does not allow direct

purchase, it may be perceived as less valuable by those who already buy online. This

measure serves as a proxy for shoppers’ tech-savviness.

These exclusion restrictions result in the following first-stage selection equation for modeling Ai,

the probability of shopper i adopting the app.

(4) $\Pr(A_i = 1 \mid \mathit{Covariates}_i, \epsilon_i) = \Phi(\partial_0 \mathit{WIRENET}_i + \partial_1 \mathit{SYMSPEED}_i + \partial_2 \mathit{ONLINEBUYER}_i + \partial_3 Q_i + \epsilon_i)$

where WIRENET is local wireless network, SYMSPEED is symmetry of network speed,

ONLINEBUYER is a dummy indicating prior online purchases, Q is a vector of other covariates,

∂ is a coefficient vector, and ϵ is an error term. We compute the inverse Mills ratio from this


probit regression. In the second stage, we augment the difference-in-differences model by

including the inverse Mills ratio as an additional covariate. In a further robustness check for

selection due to unobservables, we use future app adopters to identify a similar control group for

the current app adopters. The premise for doing this is that there should not be unobserved

differences among app adopters who adopt the app at different points in time. Thus, the DIFF-

IN-DIFF estimating the treatment effects for current app adopter (treated) cohorts relative to

future app adopters (control) across the same time period acts as a falsification test (Manchanda

et al. 2015) and provides consistent results (Web Appendix D).
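For readers less familiar with the correction term, the sketch below computes an inverse Mills ratio from a first-stage probit index. It assumes the two-sided form commonly used when the selection variable is a treatment dummy: φ(z)/Φ(z) for adopters and −φ(z)/(1 − Φ(z)) for non-adopters. The text above does not state which form was used, so this is an assumption. The index values are hypothetical, not estimates.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def inverse_mills(index, adopted):
    """Selection-correction term from the first-stage probit index.

    index:   fitted linear index of Equation (4) for one shopper.
    adopted: True for app adopters, False for non-adopters.
    """
    pdf = _STD_NORMAL.pdf(index)
    cdf = _STD_NORMAL.cdf(index)
    if adopted:
        return pdf / cdf
    return -pdf / (1.0 - cdf)  # assumed form for non-adopters


# Hypothetical first-stage indices; these are not estimates from the paper.
for z, a in [(-1.0, True), (0.0, True), (0.0, False), (1.5, False)]:
    print(z, a, round(inverse_mills(z, a), 4))
```

The resulting value would enter the second-stage difference-in-differences regression as one additional covariate per shopper.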

4.4. Decomposing Treatment Effects: Exponential Type II Tobit Model

While the difference-in-differences model provides estimates for the aggregate effects, it is not

informative about the source of the effects. Do app adopters buy relatively more or less

frequently or do they buy more or less whenever they decide to buy? How do the monetary

values of purchases and returns vary conditional on the decision to purchase or return?

We jointly model the incidence (whether or not to purchase or return) and the monetary value

of purchases and returns. We use an exponential Type II Tobit for the following reasons: (1) we

have a censored model with mass point at zero, (2) our outcome of interest is an empirical

counterpart of a latent variable (utility from purchase or return), and (3) we want our fitted

values to remain in the range of the LDV (limited dependent variable), which in this case is non-

negative for monetary values, and within 0 and 1 for incidence. In the first stage of our model,

we specify a probit model for modeling the binary outcome of whether a shopper purchases

(returns) in a given period. In the second stage, we subsequently model the monetary value of

purchases (returns) per occasion (Wooldridge 2002).


We now describe our model setup. A shopper i chooses whether to make a purchase or not at

time t. If the shopper’s expected utility from purchasing is greater than zero, we expect the

purchase incidence to be positive for that time period. The latent utility depends on mobile app

adoption and other covariates.⁵ Mobile app adoption in our sample is exogenous to outcomes

conditional on propensity scores. In other words, we use the propensity score matched samples to

estimate the Tobit models.

Purchase Incidence: Let hit, the purchase incidence of shopper i in period t be given by:

(5) $h_{it} = \begin{cases} 0 & \text{if } h^*_{it} \le 0 \\ 1 & \text{if } h^*_{it} > 0 \end{cases}$

where $h^*_{it} \mid A_i, P_t, X_{it} = \beta^{PI} + \beta_1 A_i + \beta_2 P_t + \beta_3 A_i P_t + \beta_4 X_{it} + \varepsilon^{PI}_{it}$ is the latent utility of

purchasing; X is a vector of covariates, β is a coefficient vector, and εPI is an error term. The

probability of the ith shopper making a purchase at time t is:

(6) $P^{PI}_{it}(h_{it} = 1 \mid A_i, P_t, X_{it}) = \Phi\left(\dfrac{\beta^{PI} + \beta_1 A_i + \beta_2 P_t + \beta_3 A_i P_t + \beta_4 X_{it}}{\sigma^{PI}}\right)$

where σPI is the standard deviation of the error term εPI. We next create our conditional

likelihood function across t periods for any individual i to apply the Maximum Likelihood

Estimator (MLE):

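(7) $\int_{-\infty}^{\infty} \left\{ \prod_{t=1}^{T} \left[ \Phi\left(\dfrac{\beta^{PI} + \beta_1 A_i + \beta_2 P_t + \beta_3 A_i P_t + \beta_4 X_{it}}{\sigma^{PI}}\right) \right]^{h_{it}} \left[ 1 - \Phi\left(\dfrac{\beta^{PI} + \beta_1 A_i + \beta_2 P_t + \beta_3 A_i P_t + \beta_4 X_{it}}{\sigma^{PI}}\right) \right]^{(1 - h_{it})} \right\}$

Monetary Value of Purchases: Let git, the monetary value of purchases per purchase

occasion for shopper i in period t be given by: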

5 As a robustness check, we estimated the Tobit models without the covariates for purchase and return amounts, and found consistent estimates for the treatment effect.


(8) $g_{it} = 1[h^{*}_{it} > 0]\, g^{*}_{it}$, where $g^{*}_{it} = \exp\left(\beta^{PA} + \beta_5 A_i + \beta_6 P_t + \beta_7 A_i P_t + \beta_8 V_{it} + \varepsilon^{PA}_{it}\right)$

where $g_{it}$ is the log of monetary value of purchases per purchase occasion for shopper i in time t. We observe it when a shopper makes a purchase.6 $\boldsymbol{\beta}$ is a coefficient vector and $\varepsilon^{PA}$ is an error term. The purchase incidence and monetary value form an exponential Type II Tobit model.
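To make the two-part structure of equations (5) and (8) concrete, the following is a minimal sketch of the log-likelihood contribution of a single shopper-period. It assumes, purely for illustration, that the incidence and amount errors are independent and it ignores the integral in equation (7); neither simplification is stated in the text. The function and argument names are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_logpdf(z):
    """Log of the standard normal density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def loglik_obs(h, g, index_pi, index_pa, sigma_pa):
    """Log-likelihood of one shopper-period (illustrative only).

    h         -- 1 if a purchase occurred, else 0
    g         -- amount per occasion; must be > 0 when h == 1
    index_pi  -- incidence index already divided by sigma_PI, as in eq. (6)
    index_pa  -- linear index inside exp(.) in eq. (8)
    sigma_pa  -- standard deviation of the amount error
    Assumption: the two error terms are independent.
    """
    p = norm_cdf(index_pi)
    if h == 0:
        return math.log(1.0 - p)
    z = (math.log(g) - index_pa) / sigma_pa
    # Lognormal density of the amount: phi(z) / (sigma_pa * g)
    return math.log(p) + norm_logpdf(z) - math.log(sigma_pa) - math.log(g)
```

Summing these contributions over shoppers and periods and maximizing over the coefficients would give estimates for this simplified version of the model.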

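Return Incidence: Let $r_{it}$, the return incidence of shopper i in period t, be given by:

(9) $r_{it} = \begin{cases} 0 & \text{if } r^{*}_{it} \le 0 \\ 1 & \text{if } r^{*}_{it} > 0 \end{cases}$

where $r^{*}_{it} \mid A_i, P_t, Z_{it} = \beta^{RI} + \beta_9 A_i + \beta_{10} P_t + \beta_{11} A_i P_t + \beta_{12} Z_{it} + \varepsilon^{RI}_{it}$ is the latent utility of returning. Z is a vector of covariates, $\varepsilon^{RI}$ is an error term, and the other terms are as defined earlier. The probability of the ith shopper making a return at time t is:

(10) $P^{RI}_{it}(r_{it} = 1 \mid A_i, P_t, Z_{it}) = \Phi\left(\dfrac{\beta^{RI} + \beta_9 A_i + \beta_{10} P_t + \beta_{11} A_i P_t + \beta_{12} Z_{it}}{\sigma_{RI}}\right)$

where $\sigma_{RI}$ is the standard deviation of the error term $\varepsilon^{RI}$. The conditional likelihood function for t periods for any individual i conditional on $Z_{it}$ is:

(11) $\displaystyle\int_{-\infty}^{\infty} \left\{ \prod_{t=1}^{T} \left[\Phi\left(\dfrac{\beta^{RI} + \beta_9 A_i + \beta_{10} P_t + \beta_{11} A_i P_t + \beta_{12} Z_{it}}{\sigma_{RI}}\right)\right]^{r_{it}} \left[1 - \Phi\left(\dfrac{\beta^{RI} + \beta_9 A_i + \beta_{10} P_t + \beta_{11} A_i P_t + \beta_{12} Z_{it}}{\sigma_{RI}}\right)\right]^{(1 - r_{it})} \right\}$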

6 V is a vector of shopper covariates, such as income (proxied by average monthly past spending) and tenure (time elapsed since becoming a customer), denoting the income effect on spending and the experience effect on spending, respectively (Thaler 1990; Bolton 1998). Prior research (e.g., Ailawadi and Neslin 1998; Kushwaha et al. 2015) shows that the monetary value of purchases also depends on the inventory effect, modeled as the durability of the product category last purchased. In our context, it would make sense to model video game console buyers differently from games-only buyers because unlike games, consoles are infrequent and high value purchases. We use the data in the calibration period to classify shoppers into console-only buyers, game-only buyers, and buyers of both categories. We find that less than 1% of our sample bought a console. Moreover, 99% of the video game buyers are game-only buyers, possibly because they already own a console. Furthermore, there is no distinction in the percentage of multiple category shoppers between the treatment and control groups; 30% of shoppers in both groups are multi-category buyers. Therefore, we do not control for the inventory effect.


Monetary Value of Returns: Let $t_{it}$, the monetary value of returns per return occasion for shopper i in period t, be given by:

(12) $t_{it} = 1[r^{*}_{it} > 0]\, t^{*}_{it}$, where $t^{*}_{it} = \exp\left(\beta^{RA} + \beta_{13} A_i + \beta_{14} P_t + \beta_{15} A_i P_t + \beta_{16} W_{it} + \varepsilon^{RA}_{it}\right)$

where $t_{it}$ is the log of monetary value of returns per occasion for shopper i in time t. We observe it when a shopper makes a return. W is a vector of covariates and $\varepsilon^{RA}$ is an error term. Return incidence and monetary value form an exponential Type II Tobit model. We assume that the errors in the Tobit models are normally distributed: $\varepsilon^{PI}_{it} \sim N(0, \sigma^{2}_{PI})$, $\varepsilon^{PA}_{it} \sim N(0, \sigma^{2}_{PA})$, $\varepsilon^{RI}_{it} \sim N(0, \sigma^{2}_{RI})$, and $\varepsilon^{RA}_{it} \sim N(0, \sigma^{2}_{RA})$.

A summary of the covariates and their likely relationships with shopping outcomes appears

in Table 6. The table also lists the covariate notations and the relevant supporting research.


5. Results and Robustness Checks

5.1. Results

The results from the baseline difference-in-differences model in Panel (A) of Table 5 show a

positive and significant effect of app adoption on the incidence and monetary value of purchases

and returns (p < 0.001). App adopters spend $42.73 more than non-adopters in the month after

adoption and engage in a higher number of purchases (α3=0.772, p < 0.001). Interestingly,

relative to non-adopters, app adopters also return $9.06 more worth of products and make a greater number of returns (α3=0.133, p < 0.001) each month.
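For intuition, in a two-group, two-period linear specification without covariates, the interaction coefficient equals the difference between the two groups' mean changes. The sketch below computes that quantity from raw observations; the data layout and function name are hypothetical, the toy numbers are not from the retailer's data, and the paper's estimated models may include further terms.

```python
def did_estimate(rows):
    """Difference-in-differences from (treated, post, outcome) tuples.

    Returns (treated post mean - treated pre mean)
          - (control post mean - control pre mean).
    """
    sums, counts = {}, {}
    for treated, post, outcome in rows:
        key = (treated, post)
        sums[key] = sums.get(key, 0.0) + outcome
        counts[key] = counts.get(key, 0) + 1
    mean = {key: sums[key] / counts[key] for key in sums}
    return (mean[(1, 1)] - mean[(1, 0)]) - (mean[(0, 1)] - mean[(0, 0)])


# Toy monthly spending values (hypothetical).
toy = [
    (1, 0, 60.0), (1, 0, 80.0),    # adopters, pre-period
    (1, 1, 110.0), (1, 1, 130.0),  # adopters, post-period
    (0, 0, 50.0), (0, 0, 70.0),    # non-adopters, pre-period
    (0, 1, 60.0), (0, 1, 70.0),    # non-adopters, post-period
]
effect = did_estimate(toy)
```

In this toy example adopters' mean spending rises by 50 and non-adopters' by 5, so the estimated effect is 45.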

Panel (B) of Table 5 refines these estimates by controlling for self-selection and ensuring that

the two groups, app adopters and non-adopters, are comparable based on a propensity score

matching model with 1:1 nearest neighbor matching without replacement. App adoption has a


positive and significant effect on shopper outcomes, including the number and monetary value of

purchases and returns (p < 0.001). Relative to the baseline model, the coefficients for the matched sample indicate a larger positive and significant effect of app adoption. App adopters spend $47.91 more than non-adopters in the month after adoption and buy more often (α3=0.869, p < 0.001). Relative to non-adopters, app adopters also return $10.76 more worth of products and return products more often (α3=0.144, p < 0.001) each month.

Overall, app adoption leads to a higher net monetary value of $37.15 (p < 0.001).
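As a quick consistency check, the reported net effect matches the difference between the matched-sample purchase and return effects stated above. This is only an arithmetic illustration; the net effect may have been estimated directly in a separate model.

$$\$47.91 - \$10.76 = \$37.15$$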

The results hold even after we include the Heckman correction term in the model (see panel

(C) of Table 5). We do not find evidence for selection on unobservables as the coefficient of the

inverse Mills ratio is insignificant (p > 0.4). The detailed results of the first stage probit model

appear in Web Appendix D. Histograms depicting the differences in monetary value of purchases

and returns for the matched samples appear in Figure 3. The pattern is similar to that in Figure 2.

< Table 5 and Figure 3 about here >

Histograms showing the distribution of propensity scores for the treated and the control

group before and after matching appear in Figure 4. Matching on propensity scores improves the

percentage balance of propensity scores by 99.32%, making the matched treated and control

groups comparable.7
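The matching step can be illustrated with a minimal sketch of greedy 1:1 nearest neighbor matching without replacement on estimated propensity scores. The processing order (treated units sorted by descending score), the function name, and the toy scores are assumptions for illustration; the text does not specify these details.

```python
def match_nearest(treated, control):
    """Greedy 1:1 nearest neighbor matching without replacement.

    treated, control -- dicts mapping unit id -> propensity score
    Returns a list of (treated_id, control_id) pairs.
    Assumption: treated units are processed in descending score order.
    """
    available = dict(control)  # controls not yet used
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score distance.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        pairs.append((t_id, c_id))
        del available[c_id]  # without replacement
    return pairs


# Hypothetical scores for two adopters and three non-adopters.
pairs = match_nearest(
    {"t1": 0.80, "t2": 0.30},
    {"c1": 0.75, "c2": 0.35, "c3": 0.10},
)
```

Each control is used at most once, so the unmatched control ("c3" here) drops out of the matched sample.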


How do the monetary values of purchases and returns vary conditional on the decision to

purchase or return? The results of the exponential Type II Tobit model (Table 7) provide rich

insights. Interestingly, while app adopters are likely to buy 21% more often, the effect on the

monetary value of purchases per purchase occasion is negative and significant (p < 0.05) relative

7 Appendix C reports the evidence of similarity of the two groups in the mean values of each observed covariate before and after matching, additional checks, and the results of the logit model used to compute propensity scores.


to non-adopters. The magnitude is close to 88%, implying that, relative to the pre-period, adopters in the post-period spend 12% less per purchase occasion than they would have in the absence of the app. From the purchase models in panel (A) of Table 7, we further note that recency

negatively influences purchase incidence (p < 0.001) and income and tenure positively influence

the monetary value of purchases per occasion (p < 0.05).
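The step from 88% to 12% follows from the exponential form of equation (8), in which the interaction term scales the monetary value multiplicatively. The worked relation below is a sketch that reads the reported 88% as the exponentiated interaction coefficient; the coefficient value itself is implied by that reading rather than quoted from Table 7.

$$\exp(\beta_7) \approx 0.88 \;\Rightarrow\; \beta_7 \approx \ln(0.88) \approx -0.128, \qquad 1 - \exp(\beta_7) \approx 0.12$$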


From the exponential Type II Tobit returns model in panel (B) of Table 7, we observe a

negative effect of recency on return incidence (p < 0.001). Distance to the nearest store and the

number of stores in the shoppers’ zip codes do not have significant effects (p > 0.10). Our key

finding from this model is that relative to non-adopters, app adopters are 73% more likely to

return products (p < 0.001). Conditional on return incidence, however, there is no significant

effect on the monetary value of returns per occasion (p > 0.10). We explain the intuition and

possible mechanisms for these results in Section 6.

5.2. Checking Robustness and Ruling out Alternative Explanations

We perform several robustness checks and tests to rule out alternative explanations for the effect

of app adoption on purchases and returns. A summary appears in Table 8.


5.2.1. Alternative measures of app adoption: To rule out idiosyncrasy in the app adoption

measure, we estimated an alternative model with a more nuanced measure of app adoption. In

our main model, we compared app adopters and non-adopters along their outcomes one month

before and after app download. In the alternative model, we apply a fine-grained measure of

mobile app adoption based on app usage. In this test, we create a new quasi-experiment. We

focus on only app adopters in the post-adoption period. We segment adopters into app users and


non-users. A user is any app adopter who logs into the app at least once in a given month. Thus, a user in one month could be a non-user in the next month. This process acts as a robustness check because it shows that the effects are not driven by individual characteristics

over time (Xu et al. 2016). Column (A) of Table 8 reports the results from the comparison of app

users and non-users. Further, we re-estimate this model with propensity score matching, by

matching users and non-users for each month’s activity dynamically. We report the estimates

from four different propensity score matches, those for users and non-users based on usage status

in the four months from January to April 2015 in Web Appendix B; the outcomes are measured

from December 2014 to June 2015. The effects are robust and consistent with previous findings.

5.2.2. Outliers: Another possible explanation for the effects could be outliers, such as the top

spenders (and not the average shopper). We test this possible explanation by first removing the top spenders from our sample (those whose spending in the pre-period exceeds the mean plus two standard deviations) and then carrying out propensity score matching and difference-in-differences estimation as done earlier. Our estimates are robust, as shown in column (B) of Table 8. We also test

robustness to outliers based on spending in the calibration period and find consistent results.
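The outlier rule can be sketched as follows. The sketch assumes that the cutoff is the mean plus two standard deviations of pre-period spending and uses the population standard deviation; the function name, the data layout, and the toy values are hypothetical.

```python
import statistics


def drop_top_spenders(pre_spend, k=2.0):
    """Keep shoppers whose pre-period spending is at most mean + k * SD.

    pre_spend -- dict mapping shopper id -> pre-period spending
    Assumption: population standard deviation (statistics.pstdev).
    """
    values = list(pre_spend.values())
    cutoff = statistics.mean(values) + k * statistics.pstdev(values)
    return {shopper: v for shopper, v in pre_spend.items() if v <= cutoff}


# Nine typical shoppers and one very heavy spender (hypothetical values).
toy_spend = {f"s{i}": 50.0 for i in range(9)}
toy_spend["s9"] = 2000.0
kept = drop_top_spenders(toy_spend)
```

Here the mean is 245 and the standard deviation is 585, so the cutoff is 1,415 and only the heavy spender is removed before matching.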

5.2.3. Shopper heterogeneity: While we match shoppers on a broad set of covariates, one

possible untested alternative explanation is that the effects are driven largely by already deal-

prone shoppers and not by the use of the app. In general, marketing promotions and offers are

not a threat to our difference-in-differences estimates because the retailer did not send any

unique offers to mobile app adopters that the non-adopters did not receive, or vice-versa. We do

not expect deals to affect one set of shoppers idiosyncratically. Yet, to rule out the possibility

that actual redemption or use of offers prior to adoption could influence the two groups

differently, we repeat the analyses after removing deal-prone customers. Column (C) of Table 8


reports the estimates for the sample that did not use deals in the pre-period. We find robust

estimates. We also test robustness to deal-proneness based on offer use in the calibration period

and find consistent results. This finding is managerially significant because it indicates that app adopters buy more even when they are not deal-prone, not simply because they are sensitive to deals. However, we do notice that the share of shoppers using deals increases after app adoption for app adopters (from 6% to 50% of shoppers) while remaining stagnant for non-adopters, suggesting goal-switching effects of app usage (Shankar et al. 2016).

5.2.4. Alternative matching methods: Our main analysis relies on the commonly used 1:1

nearest neighbor matching algorithm. In addition, we use the Mahalanobis metric and a refined

caliper matching approach by defining the bandwidth within which to identify matched control

units (Silverman 1986). We test using a bandwidth of 0.16 times the standard deviation of the

propensity scores, in line with the Silverman rule of thumb. To enhance support for our matches,

we also adopt a trimming approach, in which we drop the observations whose propensity score is

smaller than the minimum or larger than the maximum in the opposite group (Caliendo and

Kopeinig 2005). Web Appendix C reports the results for these alternative matched samples. We

find that these estimates are consistent with those from our proposed method.
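The trimming rule can be sketched as a minimum-maximum comparison of propensity scores across the two groups. This is an illustrative reading of the rule described above; the function name, data layout, and toy scores are hypothetical.

```python
def trim_to_common_support(treated, control):
    """Drop units whose score lies outside the opposite group's range.

    treated, control -- dicts mapping unit id -> propensity score
    Returns (kept_treated, kept_control).
    """
    t_min, t_max = min(treated.values()), max(treated.values())
    c_min, c_max = min(control.values()), max(control.values())
    kept_treated = {k: v for k, v in treated.items() if c_min <= v <= c_max}
    kept_control = {k: v for k, v in control.items() if t_min <= v <= t_max}
    return kept_treated, kept_control


# Hypothetical scores: t3 exceeds every control, c1 is below every adopter.
kept_t, kept_c = trim_to_common_support(
    {"t1": 0.20, "t2": 0.60, "t3": 0.95},
    {"c1": 0.05, "c2": 0.30, "c3": 0.70},
)
```

After trimming, both groups lie in the overlapping score range, which is the region where matched comparisons are supported by the data.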

5.2.5. Alternative samples: Our main analysis reports the results for a sample of 3,258

shoppers; the treatment group for this random sample comprises those who started using the

mobile app on December 1, 2014. To verify that the results are generalizable to other samples,

we replicated the analyses for two different types of samples: (a) a random sample selected on a

different date, February 1, 2015 (column D of Table 8) and (b) a random sample selected from

each month over a four-month period (February-May 2015; see Web Appendix B for

details). In (b), the treatment group users could have started using the app on any date in that


month. We treat the month of adoption as part of the post-treatment period, similar to other

studies (Xu et al. 2016). Our results are consistent across four such samples of 11,380 shoppers.

5.2.6. App novelty effect: An alternative explanation for the increased net monetary value of

purchases after app adoption could be the novelty of the app. It is possible that the app triggers a

heightened shopping response only due to a temporary novelty effect that fades after a few days.

To test this explanation, we re-estimate our models using extended windows of time, that is, 45

days and 60 days, instead of one month. The effects of the app persist in these varying windows

of time, and in fact seem to increase over time (Table B4 in Web Appendix B).

5.2.7. Future adopters as control group: Column E of Table 8 shows the results of an

alternative DIFF-IN-DIFF model that uses future app adopters as a control group for current

adopters. The results are substantively similar.

6. Mechanisms Explaining App Adoption Effects: App Features and Usage Patterns

Our findings provide robust evidence for the influence of mobile app adoption on shopper

behavior. We find that app adopters buy 21% more often but spend 12% less per purchase

occasion and return 73% more often than non-adopters in the month after adoption. Overall, app

adoption results in a 24% increase in net monetary value of purchases.

What mechanisms underlie higher purchases and returns due to app adoption? App adopters’

use of app features may help answer this question. An investigation into the use of app features

shows that the two most commonly used features, the offer feature (e.g., clicking current deals on products) and the loyalty reward feature (e.g., checking loyalty points), could potentially explain app

adoption effects on shopping outcomes. These features primarily involve an experiential

outcome via interactivity (Bellman et al. 2011). The interactivity of mobile devices is

characterized by the control that users have over the device and the notion of presence, that is,


the ability to experience an environment closely through the technology. Activities like

redeeming reward points and activating offers will likely lead to greater engagement and

spending by shoppers (Kim et al. 2015). Over 90% of app adopters who accessed the app used its

interactive features.

An analysis of app adopters’ use of offer and loyalty reward features is noteworthy because it

helps explain our key finding of lower monetary value per purchase occasion. The descriptive

statistics pertaining to the use of offer and loyalty features pre and post app adoption appear in

Tables 9 and 10, respectively. From these tables, we examine the differences in shopping

outcomes for app adopters who access the offer and loyalty features versus those who do not. As expected, the value of each purchase for users of these features falls by about 16-20% between the pre- and post-periods while remaining virtually the same for those who do not use such features. Furthermore, the share of app adopters using offers rises from 6% in the pre-adoption period to over 50% in the post-adoption period. To verify the mechanism of offer

exposure through the app, we further examined the nature of app usage by shoppers on the day

they make a purchase and one day before. Indeed, out of 1,712 transactions made by app users in

the post adoption period, over 42% were triggered after the use of offer-related features and 53%

after the use of loyalty rewards-related features.

< Tables 9 and 10 about here >

If increased offers and rewards exposure is indeed one of the mechanisms for higher

purchase incidence and lower purchase values, we can easily explain higher returns for app

adopters. Higher incidence of returns due to app adoption could result from three related reasons.

First, buying decisions made at the lure of an offer by app adopters may lead to post-purchase


disutility and returns. Second, higher return incidence could result from exposure to

disconfirming information after the shopper has made the purchase because app users are likely

to get exposed to negative information about the product through reviews. Our conversations

with managers from the retail company confirmed this insight; the executives revealed that when

social media opinion leaders and influencers share negative reviews about a video game, even if

the game received positive reactions and pre-orders in the pre-release period, it is common to see

spikes in return incidence among buyers. Third, app adopters may be engaging in returns more

often simply because they become less inhibited after using the app; some studies indicate that

the lack of social cues in electronic device-mediated communication prompts individuals to

become less inhibited (Sproull and Kiesler 1986).

Finally, contrary to intuition, shoppers who access a higher number of unique features do not

always buy more. In fact, beyond a point, their sales and returns outcomes weaken. Figure 5

demonstrates this inverted U-shaped relationship through a scatter plot between users’ average

number of unique features accessed in the app and their shopping outcomes. Shoppers who use a

very high number of features in the app may experience disutility from information overload and

an increased focus and attention on the device itself, rather than on actively thinking about a

purchase and taking actions.


7. Managerial Implications

Our results offer several key managerial implications. First, based on the difference in net

monetary value of purchases due to app adoption from the propensity score matched DIFF-IN-

DIFF model ($37) and the most conservative estimate from the robustness checks ($23), across

the retailer’s two million adopters, we estimate the retailer’s net annual revenue increase due to


app launch to range from $550 million to $890 million. This estimate provides a useful

benchmark for managers to evaluate any app introduction decision. This estimate is likely to be

higher if the retailer can convince more shoppers from its 32 million shopper base to adopt its

mobile app.
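A back-of-the-envelope check, assuming that the monthly per-adopter estimates are simply multiplied by 12 months and by two million adopters, approximately reproduces this range. The annualization rule is an assumption made here for illustration; the text does not spell out the calculation.

$$\$23 \times 2{,}000{,}000 \times 12 = \$552 \text{ million}, \qquad \$37 \times 2{,}000{,}000 \times 12 = \$888 \text{ million}$$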

Second, the findings that the purchase frequency (monetary value of purchases per occasion)

is higher (lower) for adopters than for non-adopters suggest that managers should plan for shoppers

visiting the physical and online stores more often and spending less on each occasion. These

findings also suggest reduced interaction of store associates and online agents with shoppers on a

given store visit, so the key task of associates is to encourage shoppers to visit again.

Third, the finding that app adoption leads to greater product returns exposes managers to a

darker side of apps. Managers need to proactively monitor return incidence from app adopters

and devise interventions to keep product returns in check. To the extent that some of the returns are due to the gap between the expected and the actual product delivered, managers can minimize the gap by

offering clearer pictures, videos, and descriptions of the products in the app.

Fourth, the findings on the role of offers and reward features indicate the importance of

dynamic experiential content in the app that provides additional value to shoppers. To promote

engagement, managers need to ensure that interactive features such as redeeming reward points

and activating offers are easily accessible.

Finally, we caution managers against an all-in-one app design and in favor of a more

thoughtful combination of app features to avoid information overload. In doing so, managers

should adapt their mobile app design strategies to their context, including the product category.

8. Conclusion, Limitations, and Extension


We addressed our three research questions rigorously using two complementary research

designs, a difference-in-differences method and the exponential Type II Tobit model. We tested

a variety of alternative explanations that could contaminate our estimates. Our main findings are threefold. First, mobile app adoption

leads to higher purchase incidence, return incidence, and net monetary value of purchases.

Second, app adopters buy 21% more often but spend 12% less per purchase occasion and return

73% more often than non-adopters in the month after adoption. Overall, app adoption results in a

24% increase in net monetary value of purchases. Third, experiential app features like

promotional offers and loyalty rewards significantly affect shopping outcomes, and the number

of unique app features accessed by the shopper has an inverted U-shaped relationship with

shopping outcomes.

We tested our results for several alternative explanations including shopper heterogeneity in

deal-proneness. The robustness of our estimates to pre-adoption deal-proneness of shoppers

shows that mobile apps influence non deal-prone shoppers as well. However, once they adopt the

app, exposure to offers and reward features plays an instrumental role in driving app adopters’

shopping outcomes.

Although our research is the first to quantify the effects of mobile apps on a broad range of

shopping outcomes, including returns, it has some limitations that future research can address.

First, our research relies on a quasi-experimental approach. The gold standard for causal

inference is randomized field experiments. Randomized field experiments, if feasible, could

provide future researchers with unique opportunities for testing specific app-related

manipulations. Second, while we have examined the net monetary value of purchases, our data

do not contain cost information. If cost data are available, it would be interesting to study the

effect of app adoption and use on customer lifetime value. Third, we have data from only one


retailer across channels. Future studies on mobile apps can examine data on multiple retailers to

map shoppers’ brand loyalty and preference resulting from app adoption. Likewise, future

studies can examine these research questions in the context of other retailer types such as pure-play retailers with a growing bricks-and-mortar presence (e.g., Warby Parker, Bonobos). Such a

setting could also offer interesting comparative insights on the effects of mobile apps on

shopping outcomes in different channels. Finally, there is immense potential to continue to

uncover the mechanisms underlying engagement and use of mobile apps. Furthermore, what

marketing mix strategies should firms adopt to improve adoption of and engagement through

apps? These are ripe areas for future investigation.


References

Ailawadi KL, Neslin SA (1998) The effect of promotion on consumption: Buying more and consuming it faster. J. Marketing Res. 35(3):390–398.

Anderson E, Hansen K, Simester D (2009) The option value of returns: Theory and empirical evidence. Marketing Sci. 28(3):405-423.

Andrews M, Goehring J, Hui S, Pancras J, Thornswood L (2016a) Mobile promotions: A framework and research priorities. J. Interactive Marketing 34:15–24.

Andrews M, Luo X, Fang Z, Ghose A (2016b) Mobile ad effectiveness: Hyper-contextual targeting with crowdedness. Marketing Sci. 35(2):218-233.

Bellman S, Potter RF, Treleaven-Hassard S, Robinson JA, Varan D (2011) The effectiveness of branded mobile phone apps. J. Interactive Marketing 25(4):191–200.

Bolton RN (1998) A dynamic model of the duration of the customer’s relationship with a continuous service provider: The role of satisfaction. Marketing Sci. 17(1):45–65.

Caliendo M, Kopeinig S (2005) Some practical guidance for the implementation of propensity score matching. IZA Discussion Paper No. 1588.

Campbell D, Stanley JC (1963) Experimental and Quasi-Experimental Designs for Research (Houghton Mifflin, Boston).

comScore (2016a) comScore reports January 2016 US smartphone subscriber market share. ComScore. Accessed July 12, 2016, http://tinyurl.com/gq3x5s5

comScore (2016b) The 2016 US mobile app report. ComScore. Accessed October 20, 2016, http://tinyurl.com/j43tjyw

Cragg JG (1971) Some statistical models for limited dependent variables with application to the demand for durable goods. Econometrica 39(5):829-844.

Danaher PJ, Smith MS, Ranasinghe K, Danaher TS (2015) Where, when, and how long: Factors that influence the redemption of mobile phone coupons. J. Marketing Res. 52(5):710–725.

Dinner I, Heerde HV, Neslin S (2015) Creating customer engagement via mobile apps: How app usage drives purchase behavior. Working paper.

Dubé JP, Fang Z, Fong NM, Luo X (2015) Competitive price targeting with smartphone coupons. NBER Working Paper No. 22067.

Eadicicco L (2015) More people now shop on Amazon using smartphones and tablets than computers. TIME. Accessed October 2, 2016, http://tinyurl.com/hddh88p

Einav L, Levin J, Popov I, Sundaresan N (2014) Growth, adoption, and use of mobile e-commerce. Amer. Econom. Rev. 104(5):489-494.

eMarketer (2014) 2 Billion consumers worldwide to get smart(phones) by 2016. Emarketer. Accessed October 2, 2016, http://tinyurl.com/kkpxevo

eMarketer (2016) Mobile ad spend to top $100 billion worldwide in 2016, 51% of digital market. Emarketer. Accessed August 10, 2016, http://tinyurl.com/p79lymk

Ericsson (2016) Ericsson Mobility Report. Ericsson. Accessed August 10, 2016, http://tinyurl.com/gmnezg6

Fong NM, Fang Z, Luo X (2015) Geo-conquesting: Competitive locational targeting of mobile promotions. J. Marketing Res. 52(5):726–735.

Forbes (2015) How mobile ordering can impact Starbucks’ valuation. Forbes. Accessed October 2, 2016, http://tinyurl.com/zrekrwp

Forrester (2016) 2016 Mobile and app marketing trends. Forrester.

Gill M, Sridhar S, Grewal R (2016) On returns to business-to-business mobile engagement apps. Working paper.

Google M/A/R/C Study (2013) Mobile in-store research: How in-store shoppers are using mobile devices. Google M/A/R/C. Accessed October 2, 2016, http://tinyurl.com/gr7ghpn

Guo S, Fraser MW (2014) Propensity Score Analysis: Statistical Methods and Applications (Sage Publications).


Heckman J (1979) Sample selection bias as a specification error. Econometrica 47(1):153-161.

Huang Q, Nijs VR, Hansen K, Anderson ET (2012) Wal-Mart’s impact on supplier profits. J. Marketing Res. 49(2):131–143.

Hui SK, Inman JJ, Huang Y, Suher J (2013) The effect of in-store travel distance on unplanned spending: Applications to mobile promotion strategies. J. Marketing 77(2):1–16.

Hung SY, Ku CY, Chang CM (2003) Critical factors of WAP services adoption: An empirical study. Electronic Commerce Research and Applications 2(1):42-60.

Jing X, Lewis M (2011) Stockouts in online retailing. J. Marketing Res. 48(2):342–354.

Kim SJ, Wang R J-H, Malthouse EC (2015) The effects of adopting and using a brand’s mobile application on customers’ subsequent purchase behavior. J. Interactive Marketing 31:28–41.

Kushwaha T, Shankar V, Li S (2015) Multichannel marketing: Asymmetries across customer-channel segments and optimal marketing allocation. Working paper.

Lee J, Zhuang M, Kozlenkova I, Fang E (2016) The dark side of mobile channel expansion strategies. MSI working paper.

Lewis M, Singh V, Fay S (2006) An empirical study of the impact of nonlinear shipping and handling fees on purchase incidence and expenditure decisions. Marketing Sci. 25(1):51-64.

Manchanda P, Packard G, Pattabhiramaiah A (2015) Social dollars: The economic impact of customer participation in a firm-sponsored online customer community. Marketing Sci. 34(3):367-387.

Ofek E, Katona Z, Sarvary M (2011) “Bricks and clicks”: The impact of product returns on the strategies of multichannel retailers. Marketing Sci. 30(1):42-60.

Peterson H (2015) Macy’s CEO says there’s one thing everyone is getting wrong about the retail industry. Business Insider. Accessed October 2, 2016, http://tinyurl.com/h7yf6ab

Rosenbaum PR (2005) Sensitivity analysis in observational studies. Everitt BS, Howell DC, eds. Encyclopedia of Statistics in Behavioral Science (Wiley, New York), 1809-1814.

Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika 70(1):41–55.

Rubin DB (2008) For objective causal inference, design trumps analysis. The Ann. of Appl. Statis. 2(3):808–840.

Shankar V, Balasubramanian S (2009) Mobile marketing: A synthesis and prognosis. J. Interactive Marketing 23(2):118–129.

Shankar V, Kleijnen M, Ramanathan S, Rizley R, Holland S, Morrissey S (2016) Mobile shopper marketing: Key issues, current insights, and future research avenues. J. Interactive Marketing 34:37-48.

Silverman BW (1986) Density Estimation for Statistics and Data Analysis (Chapman and Hall, London).

Sims G (2015) Google Play store vs the Apple App store: By the numbers. Android Authority. Accessed October 2, 2016, http://tinyurl.com/zp4ufdq

Sproull L, Kiesler S (1986) Reducing social context cues: Electronic mail in organizational communication. Management Sci. 32(11):1492-1512.

Thaler RH (1990) Saving, fungibility, and mental accounts. J. Economic Perspectives 4(1):193–205.

Wang RJ-H, Malthouse EC, Krishnamurthi L (2015) On the go: How mobile shopping affects customer purchase behavior. J. Retailing 91(2):217–234.

Wangenheim FV, Bayón T (2007) Behavioral consequences of overbooking service capacity. J. Marketing 71(4):36–47.

Wooldridge JM (2002) Econometric Analysis of Cross Section and Panel Data (MIT Press, Cambridge).

Xu J, Forman C, Kim JB, Ittersum KV (2014) News media channels: Complements or substitutes? Evidence from mobile phone usage. J. Marketing 78(4):97–112.

Xu K, Chan J, Ghose A, Han SP (2016) Battle of the channels: The impact of tablets on digital commerce. Management Sci. Forthcoming.


Table 1. Selected Related Literature and Our Contribution

Paper | Focus | DV = PI | DV = PA | DV = RI | DV = RA | Other DV | Comprehensive app features | Methods** | Context

Prior Research on the Effects of App Adoption on Dependent Variables (DVs*)

Bellman et al. (2011)

Effect of app use on brand attitude and purchase intention

Purchase intent

Online survey and lab study

Multiple branded retail apps

Einav et al. (2014)

Analysis of eBay’s mobile app adoption and platform revenues

✔ Descriptive analysis

Online retail

Xu et al. (2014)

Effect of mobile app on demand at the mobile site

Site visit Diff-in-diff Online news

Dinner et al. (2015)

Effect of app adoption on probability of making an online and offline purchase

✔ Fixed effects panel

High-end clothing retailer’s iOS app (online and a store)

Kim et al. (2015)

Effect of use of app check-ins and information look-ups on loyalty point accruals

Loyalty point accruals

PSM and Diff-in-diff

Air Miles Reward Program app

Gill et al. (2016)

Effect of manufacturer's mobile app on B2B revenues

✔ Diff-in-diff B2B engagement app of a tools manufacturer

Prior Research on the Effects of Mobile Device Adoption on Dependent Variables (DVs)

Wang et al. (2015)

Changes in customers’ spending behavior upon adopting M-shopping

✔ ✔ PSM, log-log, hazard models

Online grocery retailer’s app

Lee et al. (2016)

Effect of mobile shopping ratio (mobile vs. web) on purchases

✔ ✔ Panel Data Regression

Online retail

Xu et al. (2016)

Effect of tablet adoption on digital commerce via smartphones and PC devices

✔ Diff-in-diff Online retail

Our study

Our paper (2016)

Effects of app adoption on purchase and returns across all channels and the roles of app features on shopping outcomes

✔ ✔ ✔ ✔ - ✔

PSM, Diff-in-Diff, exponential Type II Tobit

Retailer with a chain of stores and an ecommerce site

Notes: *DV refers to the key dependent variables used in the study, including purchase incidence (PI), monetary value of purchase (PA), return incidence (RI), and monetary value of returns (RA); **Methods include difference-in-differences approach (diff-in-diff) and propensity score matching (PSM).


Table 2. Variable Definitions and Descriptive Statistics

| Window | Variable | Notation | Operationalization | Mean | St. Dev. | Min. | Max. |
|---|---|---|---|---|---|---|---|
| Estimation | Purchase Incidence | PI/h | Dummy variable indicating if at least one purchase was made in the time period (=1); else (=0) | 0.48 | 0.50 | 0 | 1 |
| Estimation | Monetary Value of Purchases | PA/g | Amount associated with purchases in the period ($) | 70.09 | 154.87 | 0 | 3733.89 |
| Estimation | Return Incidence | RI/r | Dummy variable indicating if at least one return was made in the time period (=1); else (=0) | 0.07 | 0.25 | 0 | 1 |
| Estimation | Monetary Value of Returns | RA/t | Amount associated with returns in the period ($) | 5.73 | 35.91 | 0 | 919.96 |
| Estimation | App Adoption | TREAT/A | Dummy variable indicating if the shopper adopted the app (=1) or not (=0) in the data period | 0.17 | 0.38 | 0 | 1 |
| Estimation | Time Period | POST/P | Dummy variable indicating if the time period is before (=0) or after (=1) adoption | 0.5 | 0.5 | 0 | 1 |
| Estimation | Recency | RECENCY | Number of days since the shopper’s last purchase at the start of time t | 36.81 | 29.01 | 1 | 118 |
| Estimation | Tenure | TENURE | Number of days of being a customer at the start of time t | 453.41 | 51.33 | 153 | 482 |
| Estimation | Order Size | QNT | Number of items in an order | 0.95 | 1.38 | 0 | 39 |
| Estimation | Age | AGE | Age of shopper in years at the start of the data period | 32.57 | 11.13 | 11 | 82 |
| Estimation | Gender | GENDER | Gender of shopper (Female=1, Male=0) | 0.21 | 0.41 | 0.00 | 1.00 |
| Estimation | Distance to Nearest Store | DIST | Distance in miles between the geographical centers of the shopper’s and the nearest store’s zip codes | 4.29 | 7.86 | 2.06 | 196.12 |
| Estimation | Number of Stores | NSTORES | Number of focal retailer’s stores in shopper’s zip code | 0.57 | 0.72 | 0 | 4 |
| Estimation | Loyalty Program Level | LPROG | Dummy indicating if the shopper is enrolled in the basic (=0) or professional (=1) membership on app adoption date | 0.43 | 0.49 | 0 | 1 |
| Estimation | Area Population | AREAPOPL | Population of zipcode based on 2010 US census | 31,611 | 19,009 | 6 | 113,916 |
| Estimation | Competitor Stores | COMPSTORE | Number of competing stores in shopper’s zipcode | 0.53 | 0.67 | 0 | 5 |
| Calibration | Online Buyer | ONLINEBUYER | Dummy variable indicating whether the shopper made an online purchase (=1) or not (=0) in the calibration period | 0.05 | 0.21 | 0 | 1 |
| Calibration | Past Purchase Amount | PASTSPEND | Monetary value of average monthly purchases in the calibration period ($) | 44.56 | 62.19 | 0 | 844.12 |
| Calibration | Past Return Amount | PASTRETURN | Monetary value of average monthly returns in the calibration period ($) | 4.17 | 17.82 | 0 | 482.64 |
| Calibration | Average Purchase Frequency | APF | Average number of sales transactions in a month in the calibration period | 0.79 | 0.73 | 0 | 15 |


Table 3. Model-free Evidence: Mean Statistics

| Variable | Treated pre period | Treated post period | Control pre period | Control post period |
|---|---|---|---|---|
| Purchase incidence | 0.686 | 0.822 | 0.416 | 0.441 |
| Number of purchases | 1.474 | 2.359 | 0.707 | 0.820 |
| Monetary value of purchases | 126.252 | 181.149 | 46.887 | 59.048 |
| Return incidence | 0.096 | 0.185 | 0.049 | 0.062 |
| Number of returns | 0.134 | 0.282 | 0.062 | 0.077 |
| Monetary value of returns | 9.461 | 18.555 | 4.021 | 4.052 |
| Net monetary value of purchases | 116.792 | 162.594 | 42.866 | 54.995 |

Table 4. Overview of Analyses

| Section | Analysis | Objective | Key insight/Conclusion |
|---|---|---|---|
| 4.2 | Baseline Difference-in-Differences (DIFF-IN-DIFF) Regression | Quantifying the treatment effect of app adoption on shopping outcomes | App adoption leads to higher incidence and monetary value of purchase and returns than non-adoption |
| 4.3 | DIFF-IN-DIFF Regression with (a) Selection on observables: Propensity Score Matched (PSM) Sample; (b) Selection on unobservables: Heckman correction | Correcting for potential bias in treatment effects due to self-selection | App adoption leads to higher incidence and monetary value of purchase and returns than non-adoption after correcting for endogeneity of app adoption |
| 4.4 | Exponential Type II Tobit | Decomposing the effects of app adoption into incidence of purchase (returns), and conditional on it, the monetary value of purchase (returns) per occasion in a two-stage model | App adoption leads to higher purchase and return incidence but lower monetary value of purchase per occasion |
| 5.2 | Robustness Checks: (a) Alternative measure of app adoption; (b) Outliers; (c) Customer heterogeneity in deal proneness; (d) Alternative matching; (e) Alternative samples; (f) App novelty and alternative time periods | Ruling out alternative explanations for the results | App adoption treatment effects are robust to alternative explanations, such as outliers, customer deal-proneness, app novelty, and other adoption measures, samples, and time periods |
| Web Appendix | Additional Checks. Web Appendix B: Visual plots for common trends. Web Appendix C: (a) Kolmogorov-Smirnov Test; (b) Standardized bias reduction; (c) Hidden bias sensitivity analysis. Web Appendix D: Alternative control group using future adopters to tackle unobservables | Evaluating robustness to any other potential threats to main methods | (a) Pre-adoption purchase trends in the control and treated groups are parallel. (b) PSM significantly improves balance between the treated and control groups. (c) Effects are robust to alternative control groups. |


Table 5. Results of Difference-in-Differences Models

Coeff. (Std. Err.)

Panel A. Unmatched Samples

| Variable | Number of purchases | Monetary value of purchases | Number of returns | Monetary value of returns | Net monetary value of purchases |
|---|---|---|---|---|---|
| TREAT | 0.767*** (0.042) | 79.365*** (6.099) | 0.072*** (0.013) | 5.44*** (1.24) | 73.925*** (5.767) |
| POST | 0.113*** (0.019) | 12.16*** (1.953) | 0.015* (0.005) | 0.032 (0.458) | 12.129*** (1.819) |
| TREAT*POST | 0.772*** (0.07) | 42.736*** (8.637) | 0.133*** (0.024) | 9.063*** (2.094) | 33.673*** (8.009) |
| Intercept | 0.707*** (0.013) | 46.887*** (1.332) | 0.062*** (0.004) | 4.021*** (0.348) | 42.866*** (1.226) |

Panel B. Propensity Score Matched Samples

| Variable | Number of purchases | Monetary value of purchases | Number of returns | Monetary value of returns | Net monetary value of purchases |
|---|---|---|---|---|---|
| TREAT | 0.499*** (0.052) | 64.708*** (6.837) | 0.039* (0.017) | 3.205 (1.586) | 61.503*** (6.352) |
| POST | 0.016 (0.048) | 6.983 (4.861) | 0.004 (0.015) | -1.665 (1.269) | 8.648 (4.441) |
| TREAT*POST | 0.869*** (0.083) | 47.913*** (9.717) | 0.144*** (0.028) | 10.759*** (2.405) | 37.154*** (8.976) |
| Intercept | 0.975*** (0.033) | 61.545*** (3.364) | 0.095*** (0.011) | 6.256*** (1.048) | 55.289*** (2.932) |

Panel C. Matched Sample with Heckman Correction using Inverse Mills Ratio (IMR) as Covariate

| Variable | Number of purchases | Monetary value of purchases | Number of returns | Monetary value of returns | Net monetary value of purchases |
|---|---|---|---|---|---|
| TREAT | 0.499*** (0.052) | 64.731*** (6.84) | 0.039* (0.017) | 3.21 (1.585) | 61.521*** (6.355) |
| POST | 0.016 (0.048) | 6.983 (4.862) | 0.004 (0.015) | -1.665 (1.269) | 8.648 (4.441) |
| TREAT*POST | 0.869*** (0.083) | 47.913*** (9.717) | 0.144*** (0.028) | 10.759*** (2.405) | 37.154*** (8.975) |
| IMR | 0.022 (0.095) | 5.754 (11.22) | 0.024 (0.03) | 1.296 (2.791) | 4.458 (10.321) |
| Intercept | 0.944*** (0.14) | 53.33** (16.534) | 0.06 (0.045) | 4.406 (4.182) | 48.924** (15.183) |

Notes: Robust standard errors in parentheses; *** p < 0.001, ** p < 0.01, * p < 0.05.
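As a consistency check, the coefficients of a difference-in-differences regression with only TREAT, POST, and their interaction can be recovered from the four group-period means in Table 3. The sketch below does this for the number of purchases; the function and variable names are illustrative and are not part of the original analysis.

```python
# Illustrative sketch: in a saturated 2x2 difference-in-differences regression
#   y = b0 + b1*TREAT + b2*POST + b3*TREAT*POST
# the OLS coefficients are simple combinations of the four cell means.

def did_from_means(treated_pre, treated_post, control_pre, control_post):
    """Return the four coefficients implied by the group-period means."""
    intercept = control_pre                    # control group, pre period
    treat = treated_pre - control_pre          # pre-period gap between groups
    post = control_post - control_pre          # time change in the control group
    # Change for the treated minus change for the controls.
    treat_post = (treated_post - treated_pre) - (control_post - control_pre)
    return {"Intercept": intercept, "TREAT": treat, "POST": post,
            "TREAT*POST": treat_post}

# Cell means for "Number of purchases" taken from Table 3.
coefs = did_from_means(treated_pre=1.474, treated_post=2.359,
                       control_pre=0.707, control_post=0.820)
for name, value in coefs.items():
    print(f"{name}: {value:.3f}")
```

With the Table 3 means, the values are 0.707, 0.767, 0.113, and 0.772, which agree with the number-of-purchases column in Panel A.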


Table 6. Covariates and their Relationships with Outcomes for Exponential Type II Tobit Model

| Variable | Notation | PI | PA | RI | RA | Support from research in other contexts |
|---|---|---|---|---|---|---|
| Mobile app | A*P | ✔ | ✔ | ✔ | ✔ | (Kim et al. 2015; Wang et al. 2015) |
| Recency | RECENCY | ✔ | | ✔ | | (Lewis et al. 2006; Jing and Lewis 2011) |
| Income/Past Spend | INCOME/PASTSPEND | | ✔ | | | Thaler (1990) |
| Tenure | TENURE | | ✔ | | | (Bolton 1998; Kushwaha et al. 2015) |
| Distance to nearest store | DIST | | | ✔ | | (Anderson et al. 2009; Ofek et al. 2011) |
| Number of stores | NSTORES | | | ✔ | | (Anderson et al. 2009; Ofek et al. 2011) |
| Order size | QNT | | | | ✔ | (Anderson et al. 2009) |

Note: Purchase incidence (PI), monetary value of purchases (PA), return incidence (RI), and monetary value of returns (RA).

Table 7. Results of Exponential Tobit Type II Model

| Variable (A) | Coeff. (Std. Err.) | Variable (B) | Coeff. (Std. Err.) |
|---|---|---|---|
| Log Value of Purchases Per Occasion | | Log Value of Returns Per Occasion | |
| POST (P) | 0.134** (0.046) | POST (P) | -0.293* (0.117) |
| TREAT (A) | 0.259*** (0.047) | TREAT (A) | 0.088 (0.117) |
| TREAT * POST (A * P) | -0.125* (0.062) | TREAT * POST (A * P) | 0.251 (0.171) |
| TENURE | 0.383*** (0.102) | QNT | 0.05 (0.027) |
| INCOME/PAST SPEND | 0.0004* (0.001) | INTERCEPT | 3.367*** (0.553) |
| INTERCEPT | 1.308* (0.621) | | |
| Purchase Incidence | | Return Incidence | |
| POST (P) | -0.014 (0.044) | POST (P) | 0.081 (0.067) |
| TREAT (A) | 0.470*** (0.045) | TREAT (A) | 0.213** (0.065) |
| TREAT * POST (A * P) | 0.442*** (0.066) | TREAT * POST (A * P) | 0.308*** (0.088) |
| RECENCY | -0.007*** (0.001) | DIST | -0.005 (0.004) |
| INTERCEPT | 0.246*** (0.036) | NSTORES | -0.007 (0.035) |
| | | RECENCY | -0.007*** (0.001) |
| | | INTERCEPT | -1.283*** (0.061) |
| Log Likelihood | -9359.39 | Log Likelihood | -2982.73 |
| Correlation (rho) | 0.171 | Correlation (rho) | 0.193 |

Notes: There are 2,421 (5,827) censored observations for purchase (returns) model; standard errors in parentheses; *** p < 0.001, ** p < 0.01, * p < 0.05.
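Because the value equations are specified in logs, a coefficient b on a dummy variable corresponds to a proportional change of exp(b) - 1 in the per-occasion amount under the usual log-linear interpretation. The sketch below is a minimal illustration, assuming this interpretation applies to the TREAT * POST coefficient in column (A); the helper name is hypothetical.

```python
import math

def pct_change_from_log_coef(b):
    """Percentage change in the level of y implied by a coefficient b on a
    dummy variable when the dependent variable is log(y)."""
    return 100.0 * (math.exp(b) - 1.0)

# TREAT * POST coefficient for log value of purchases per occasion (Table 7, A).
effect = pct_change_from_log_coef(-0.125)
print(f"{effect:.1f}%")
```

The result is roughly -11.8%, in line with the approximately 12% lower spending per purchase occasion reported for app adopters.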


Table 8. Robustness Checks for Treatment Effects

| Variable | (A) App use vs. non-use for six months | (B) Outliers | (C) Deal use heterogeneity | (D) Alternative sample | (E) Future treated as control |
|---|---|---|---|---|---|
| Number of purchases | 0.433*** (0.051) | 0.88*** (0.083) | 0.945*** (0.089) | 0.647*** (0.094) | 0.688*** (0.097) |
| Monetary value of purchases | 25.505*** (4.934) | 87.592*** (7.662) | 66.781*** (11.295) | 36.521*** (9.064) | 34.718** (11.646) |
| Number of returns | 0.055*** (0.015) | 0.13*** (0.028) | 0.139*** (0.027) | 0.073* (0.03) | 0.108** (0.036) |
| Monetary value of returns | 2.236* (1.054) | 10.81*** (1.974) | 11.943*** (2.528) | 8.986*** (2.126) | 5.25 (2.871) |
| Net monetary value of purchases | 23.269*** (4.668) | 76.782*** (7.113) | 54.838*** (10.616) | 27.536*** (8.349) | 29.468** (10.664) |
| Number of observations | 9,774 | 5,828 | 4,624 | 4,356 | 6,516 |

Notes: Robust standard errors in parentheses; *** p < 0.001, ** p < 0.01, * p < 0.05.

Table 9. Shopping Outcomes for Subgroups of App Adopters based on Offer Feature Usage

| Variable | Offer features used: Pre | Offer features used: Post | Offer features not used: Pre | Offer features not used: Post |
|---|---|---|---|---|
| Number of purchases | 1.55 | 2.61 | 1.39 | 2.06 |
| Monetary value of purchases | 136.27 | 191.58 | 114.19 | 168.59 |
| Number of returns | 0.16 | 0.32 | 0.10 | 0.23 |
| Monetary value of returns | 10.80 | 20.50 | 7.85 | 16.22 |
| Net monetary value of purchases | 125.47 | 171.08 | 106.34 | 152.37 |
| Monetary value of purchases per occasion | 87.92 | 73.40 | 82.15 | 81.84 |

Table 10. Shopping Outcomes for Subgroups of App Adopters based on Loyalty Feature Usage

| Variable | Loyalty reward features used: Pre | Loyalty reward features used: Post | Loyalty reward features not used: Pre | Loyalty reward features not used: Post |
|---|---|---|---|---|
| Number of purchases | 1.50 | 2.40 | 1.40 | 2.26 |
| Monetary value of purchases | 130.35 | 178.17 | 116.62 | 188.15 |
| Number of returns | 0.14 | 0.27 | 0.13 | 0.31 |
| Monetary value of returns | 8.95 | 17.99 | 10.67 | 19.89 |
| Net monetary value of purchases | 121.40 | 160.19 | 105.95 | 168.26 |
| Monetary value of purchases per occasion | 86.9 | 74.23 | 83.30 | 83.25 |


Figure 1. Mobile Apps and Shopper Choices

Figure 2. Model-free Evidence: Monetary Value of Purchases and Returns for App Adopters and Non-adopters

Figure 3. Propensity Score Matched Sample: Monetary Value of Purchases and Returns for App Adopters and Non-adopters

Figure 4. Distribution of Propensity Scores Pre- and Post-Matching

Figure 5. Model Free Evidence: Number of App Features and Monetary Value of Purchases and Returns

[Two panels: monetary value of purchases ($) and monetary value of returns ($), each plotted against the number of unique app features used.]

The Effects of Mobile Apps on Shopper Purchases and Product Returns

WEB APPENDIX

Web Appendix A. Screenshots

Figure A1. App Screenshots on iPhone

Web Appendix B. Robustness Check for Alternative Samples, Periods and Common Trends in the Pre-period

In this section, we present the results for robustness checks relating to two alternative samples

(Tables B1-B2), alternative app adoption measures (Table B3), varying time periods (Table B4)

and the common trends plots (Figures B1-B4) for the treated and control groups.

The two samples are: (a) a random sample drawn on a different date, February 1, 2015, constructed in the same way as the December 1, 2014 sample in our main analysis, with app adopters selected on a specific date of adoption, and (b) random samples drawn from each month over the four months from February to May 2015. In case (b), we treat the month of adoption as the post-period; the implicit assumption is that the shoppers who adopted the app did so at the beginning of the month. Such an aggregation approach will induce a downward bias in our estimates, since we would assume that shoppers start showing signs of increased spending right at the beginning of the period (Manchanda et al. 2015).

Similar to our main estimation, we match app adopters and non-adopters using a rich set of

covariates in a binary logit model and subsequently carry out a difference-in-differences

estimation. The binary logit model specifications are tailored for best fit. For instance, for the

February 2015 sample, we matched the samples on each past month’s spending instead of

average past spending. We also replicate the analysis for an alternative control group sample of

random non-adopters. In Tables B1 and B2, we present the results for the two sets of samples

described earlier.
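The matching step can be sketched as one-to-one nearest-neighbor matching on the estimated propensity score. The sketch below assumes the scores have already been estimated (for example, from the binary logit), matches greedily without replacement, and accepts an optional caliper. The function name, the greedy order, and the toy scores are illustrative assumptions, not the procedure or data used in the paper.

```python
def nearest_neighbor_match(treated_scores, control_scores, caliper=None):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns a list of (treated_index, control_index) pairs. A treated unit is
    left unmatched if its closest unused control lies outside the caliper.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for t_idx, t_score in enumerate(treated_scores):
        if not available:
            break
        # Closest still-unused control by absolute score distance.
        c_idx = min(available, key=lambda c: abs(control_scores[c] - t_score))
        if caliper is None or abs(control_scores[c_idx] - t_score) <= caliper:
            pairs.append((t_idx, c_idx))
            available.remove(c_idx)
    return pairs

# Toy propensity scores (hypothetical values, not from the paper).
treated = [0.30, 0.55, 0.80]
controls = [0.10, 0.28, 0.50, 0.58, 0.95]
print(nearest_neighbor_match(treated, controls, caliper=0.10))
```

In this toy example the third adopter stays unmatched because its closest remaining control is further away than the caliper. The difference-in-differences regression would then be estimated on the matched pairs only.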

Next, in Table B3, we report the estimates for a refined measure of app adoption – app use

vs. non-use based on December adopters’ usage in four months from January to April, 2015.

These estimates demonstrate that the effect is indeed due to app use, and is robust to individual

characteristics over time. Finally, in Table B4, we report the estimates for varying time windows

to rule out a possible novelty effect of the app. More specifically, we find robust results using

shorter (15-day) and longer (45- and 60-day) periods as the pre-post windows compared to our

30-day window for the main estimation.

In Figures B1-B4, we present the graphs showing the monetary values of purchases for

adopters and non-adopters before app adoption to illustrate that the common trends assumption

central to the difference-in-differences design holds.


Table B1. Difference-in-Differences Model Results for Feb 1, 2015 Sample

Treatment Effect Coeff. (Std. Err.)

| Variable | (A) Nearest neighbor matches | (B) Caliper matches | (C) Excluding outliers | (D) Excluding offer users |
|---|---|---|---|---|
| Number of purchases | 0.647*** (0.094) | 0.6153*** (0.0887) | 0.6417*** (0.092) | 0.5569*** (0.0952) |
| Monetary value of purchases | 36.521*** (9.064) | 34.7187*** (8.5425) | 41.6567*** (8.7232) | 29.1181** (9.1233) |
| Number of returns | 0.073* (0.03) | 0.088** (0.0287) | 0.0623* (0.0299) | 0.0449 (0.0309) |
| Monetary value of returns | 8.986*** (2.126) | 7.3061*** (2.0938) | 5.8777** (2.1032) | 5.887** (2.1571) |
| Net monetary value of purchases | 27.536*** (8.349) | 27.4126*** (7.8975) | 35.779*** (8.084) | 23.2311** (8.4989) |
| Number of individuals | 2,178 | 2,090 | 1,926 | 1,828 |

Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses.

Table B2. Difference-in-Differences Model Results for February-May 2015 Samples

Treatment Effect Coeff. (Std. Err.)

| Variable | (A) February adopters | (B) March adopters | (C) April adopters | (D) May adopters |
|---|---|---|---|---|
| Number of purchases | 1.191*** (0.0862) | 1.032*** (0.0936) | 1.088*** (0.0818) | 1.152*** (0.0838) |
| Monetary value of purchases | | 107.667*** (9.8333) | 86.048*** (18.8776) | 71.956*** (7.2029) |
| Number of returns | 0.101** (0.0335) | 0.175*** (0.0332) | 0.131*** (0.0299) | 0.139*** (0.026) |
| Monetary value of returns | 8.658* (4.1254) | 18.449*** (3.2732) | 7.229** (2.6187) | 8.411*** (2.1607) |
| Net monetary value of purchases | 111.894*** (7.4844) | 89.217*** (8.5666) | 78.819*** (18.5387) | 63.546*** (6.4654) |
| Number of individuals | 3,180 | 2,750 | 2,804 | 2,646 |

Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses.


Table B3. Results of Alternative Model Comparing App Users and Non-users

Effect of App Use Coeff. (Std. Err.)

| Variable | (A) Jan 2015 matches | (B) Feb 2015 matches | (C) March 2015 matches | (D) April 2015 matches |
|---|---|---|---|---|
| Number of purchases | 0.805*** (0.052) | 0.786*** (0.053) | 0.844*** (0.061) | 0.791*** (0.054) |
| Monetary value of purchases | | 58.268*** (5.179) | 59.120*** (5.715) | 58.938*** (5.883) |
| Number of returns | | 0.112*** (0.016) | 0.133*** (0.018) | 0.083*** (0.016) |
| Monetary value of returns | 6.939*** (1.194) | 9.418*** (1.544) | 7.665*** (1.372) | 6.840*** (1.543) |
| Net monetary value of purchases | 53.883*** (4.421) | 48.849*** (4.567) | 51.455*** (5.234) | 52.098*** (5.378) |
| Number of individuals | 348 | 298 | 273 | 234 |
| Number of observations | 4,872 | 4,172 | 3,822 | 3,276 |

Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses.

Table B4. Difference-in-Differences Model Results for Varying Periods Based on Time from Adoption

Treatment Effect Coeff. (Std. Err.)

| Variable | (A) 15 days pre and post | (B) 45 days pre and post | (C) 60 days pre and post |
|---|---|---|---|
| Number of purchases | 0.556*** (0.054) | 1.304*** (0.1) | 1.629*** (0.116) |
| Monetary value of purchases | | 75.089*** (10.847) | 92.645*** (11.83) |
| Number of returns | | 0.225*** (0.034) | 0.247*** (0.037) |
| Monetary value of returns | 5.773** (1.752) | 15.713*** (2.859) | 17.41*** (3.039) |
| Net monetary value of purchases | 27.033*** (7.11) | 59.376*** (9.945) | 75.235*** (10.836) |

Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses; number of individuals for these models is the same as the main sample, that is, 3,258 treated and control individuals.


Figure B1. Purchase Trends Before App Adoption for the February 2015 Sample

[Line chart: monthly monetary value of purchases ($), Aug-14 to Jan-15, for non-adopters and adopters.]

Figure B2. Purchase Trends Before App Adoption for the March 2015 Sample

[Line chart: monthly monetary value of purchases ($), Aug-14 to Feb-15, for non-adopters and adopters.]

Figure B3. Purchase Trends Before App Adoption for the April 2015 Sample

[Line chart: monthly monetary value of purchases ($), Aug-14 to Mar-15, for non-adopters and adopters.]

Figure B4. Purchase Trends Before App Adoption for the May 2015 Sample

[Line chart: monthly monetary value of purchases ($), Aug-14 to Apr-15, for non-adopters and adopters.]

Web Appendix C. Tests for Propensity Score Matching

First, we report the results of the binomial logit model of app adoption used to compute

propensity scores. The results in Table C1 present the logit model coefficients for the likelihood

of a shopper becoming an app adopter. Shoppers who are more likely to adopt the retailer’s app

tend to be younger, male, online buyers, paid loyalty members on the day of app adoption, and


higher frequency shoppers. We select this logit model after evaluating the model fit for several

other model specifications, including probit and logit with and without non-linear covariates.

Next, we present the results of various post-matching checks. First, the Kolmogorov-Smirnov

(Table C2) test shows that the distributions of propensity scores of the matched treated and

control groups are statistically similar. Second, percentage reduction in bias after matching

shows significant improvements in the values of covariates across app adopters and non-

adopters, thus making the two groups comparable (Table C3). Finally, there is no concern for

potential hidden bias due to unobservables (Table C4) or concerns for alternative matching

methods (Table C5).

We discuss these checks in detail next. The results in Table C2 show that the distribution of

propensity scores is nearly identical after matching. Table C3 shows the standardized bias before

and after matching. We calculate it as follows (Rosenbaum and Rubin 1983):

$$SB = \frac{X_t - X}{(0.5) \times \left(SD^2_{X_t} + SD^2_{X}\right)}$$

where for the standardized bias before matching $SB_{BM}$, the numerator is the difference between the value of the X covariates for the treated individuals before matching and the X covariates for all unmatched control individuals before matching, and the denominator is the equally weighted variance of the two. Likewise, the standardized bias after matching $SB_{AM}$ uses the treated means and control means after matching. We then calculate the percentage reduction in bias as:

$$PRB = 100\left(1 - \frac{SB_{AM}}{SB_{BM}}\right)$$
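The standardized bias and the percentage reduction in bias defined above can be sketched in a few lines. The code follows the text in using the equally weighted variance as the denominator; many implementations instead divide by its square root, which is offered here as an option. All names and the numbers in the example are hypothetical.

```python
def standardized_bias(mean_treated, mean_control, var_treated, var_control,
                      use_sqrt=False):
    """Standardized bias for one covariate.

    The denominator is the equally weighted variance 0.5 * (var_t + var_c),
    as described in the text. Setting use_sqrt=True divides by its square
    root instead (an assumption, not stated in the text).
    """
    denom = 0.5 * (var_treated + var_control)
    if use_sqrt:
        denom = denom ** 0.5
    return (mean_treated - mean_control) / denom

def percent_reduction_in_bias(sb_before, sb_after):
    """PRB = 100 * (1 - SB_AM / SB_BM)."""
    return 100.0 * (1.0 - sb_after / sb_before)

# Hypothetical covariate: means and variances before and after matching.
sb_bm = standardized_bias(3.0, 3.5, var_treated=1.0, var_control=1.0)
sb_am = standardized_bias(3.0, 3.05, var_treated=1.0, var_control=1.0)
prb = percent_reduction_in_bias(sb_bm, sb_am)
print(round(prb, 2))
```

In this made-up case the gap in means shrinks from 0.5 to 0.05, so the bias falls by about 90 percent. A negative value, as for several covariates in Table C3, would indicate that the gap widened after matching.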

In Table C4, we summarize the results of a sensitivity test to assess if hidden bias is a cause for

concern. In this test, we manipulate the estimated odds of receiving the treatment to see how

much the estimated treatment effects may vary. In other words, we check that the estimates are


robust to possible ranges of “hidden bias.” According to Rosenbaum (2005), a sensitivity

analysis in an observational study asks what the unmeasured covariate would have to be like to

alter the conclusions of the study.

Suppose we have two individuals j and k. Assume they have similar covariates but different

chances of receiving the treatment. In other words, Xj and Xk are the same but Aj and Ak, the

probability of adopting the app, may be different. The odds that they adopt the app are Aj/(1-Aj)

and Ak/(1-Ak). Let us assume the ratio of these odds to be at most gamma, where gamma is the exponential of delta, the indicator of hidden bias. For various gamma values starting with one, we calculate bounds on the p-values that show the uncertainty due to hidden bias.

$$\frac{1}{e^{\Delta}} \le \frac{A_j(1 - A_k)}{A_k(1 - A_j)} \le e^{\Delta}$$

Let $\gamma = e^{\Delta}$. If gamma were exactly one, or equivalently delta exactly zero, there would be no hidden bias: two shoppers with equal Xj and Xk would also have equal log odds of being treated. Gamma is thus a measure of the degree of departure from a study free of hidden bias (Guo and Fraser 2014). Our test finds that for values of gamma from 1 through 2, our inference is robust, that is, the study is insensitive to hidden bias. In other words, a value of gamma greater than 2 would be needed to change the inference.
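A minimal sketch of how such p-value bounds can be computed for the Wilcoxon signed-rank statistic on matched-pair outcome differences, using the large-sample normal approximation commonly applied for Rosenbaum bounds. The function names and the toy differences are illustrative assumptions; the computation behind Table C4 may differ in details.

```python
import math

def _norm_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def rosenbaum_bounds(diffs, gamma):
    """Lower and upper bounds on the one-sided Wilcoxon signed-rank p-value
    for matched-pair differences (treated minus control) at a given gamma."""
    d = [x for x in diffs if x != 0]              # drop zero differences
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                                  # average ranks over ties
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    t_stat = sum(r for r, x in zip(ranks, d) if x > 0)
    sum_r = sum(ranks)
    sum_r2 = sum(r * r for r in ranks)

    def p_value(p_plus):
        # p_plus: assumed chance that a pair's difference is positive.
        mean = p_plus * sum_r
        var = p_plus * (1.0 - p_plus) * sum_r2
        return _norm_sf((t_stat - mean) / math.sqrt(var))

    upper = p_value(gamma / (1.0 + gamma))        # hidden bias favors treated
    lower = p_value(1.0 / (1.0 + gamma))          # hidden bias favors controls
    return lower, upper

# Hypothetical matched-pair differences in an outcome (not from the paper).
pair_diffs = [12.0, -3.0, 8.5, 20.0, -1.5, 6.0, 15.0, 4.0, -2.0, 9.0]
for g in (1.0, 1.5, 2.0):
    low, high = rosenbaum_bounds(pair_diffs, g)
    print(g, round(low, 4), round(high, 4))
```

At gamma equal to one the two bounds coincide with the usual p-value, and the interval widens as gamma grows. An inference is called insensitive when the upper bound stays below the chosen significance level over the range of gamma considered.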


Table C1. App Adoption Model: Logit Estimates

| # | Variable | Coeff. (Std. Err.) |
|---|---|---|
| 1 | (Intercept) | 67.187* (26.133) |
| 2 | ln(1+AGE) | -0.872*** (0.085) |
| 3 | GENDER (F=1, M=0) | -0.454*** (0.079) |
| 4 | ln(1+DIST) | 0.584* (0.252) |
| 5 | ln(1+NSTORES) | 1.122 (0.609) |
| 6 | ln(1+AREAPOPL) | 0.091 (0.359) |
| 7 | ln(1+COMPSTORE) | -0.044 (0.09) |
| 8 | ln(1+TENURE) | -24.709** (9.166) |
| 9 | LPROG | 0.549*** (0.058) |
| 10 | ONLINEBUYER | 0.424*** (0.116) |
| 11 | ln(1+PASTSPEND) | -0.098 (0.125) |
| 12 | ln(1+PASTRETURN) | -0.117 (0.086) |
| 13 | ln(1+NSTORES) Sq. | -0.538 (0.356) |
| 14 | ln(1+AREAPOPL) Sq. | -0.004 (0.019) |
| 15 | ln(1+TENURE) Sq. | 2.226* (0.801) |
| 16 | ln(1+DIST) Sq. | -0.135* (0.058) |
| 17 | ln(1+PASTSPEND) Sq. | 0.006 (0.018) |
| 18 | ln(1+PASTRETURN) Sq. | 0.024 (0.023) |
| 19 | ln(1+APF) | 2.371*** (0.317) |
| 20 | ln(1+APF) Sq. | -0.507** (0.168) |

Notes: Null deviance: 8,737.9 on 9,584 degrees of freedom; Residual deviance: 8,109.8 on 9,565 degrees of freedom; AIC: 8,149.8; Number of Fisher Scoring iterations: 5; Log likelihood: -4054.91; McFadden’s Pseudo R squared: 0.072; *** p < 0.001, ** p < 0.01, * p < 0.05.

Table C2. KS Test Results

Two sample Kolmogorov-Smirnov test

Before matching: D = 0.29467, p-value < 2.2e-16

After matching: D = 0.004911, p-value = 1


Table C3. Propensity Score Matching Results: Percentage Reduction in Bias After Matching

| Variable | Means treated (before matching) | Means control (before matching) | Means control (after matching) | Percent balance improvement |
|---|---|---|---|---|
| (Intercept) | 0.227 | 0.158 | 0.226 | 99.318 |
| ln(1+AGE) | 3.379 | 3.476 | 3.380 | 98.467 |
| GENDER (FEMALE) | 0.135 | 0.225 | 0.141 | 93.883 |
| ln(1+DIST) | 1.053 | 1.050 | 1.041 | -269.836 |
| ln(1+NSTORES) | 0.357 | 0.359 | 0.352 | -136.892 |
| ln(1+AREAPOPL) | 10.100 | 10.092 | 10.127 | -240.089 |
| ln(1+COMPSTORE) | 0.342 | 0.343 | 0.334 | -472.658 |
| ln(1+TENURE) | 6.086 | 6.074 | 6.087 | 87.540 |
| LPROG | 0.561 | 0.400 | 0.570 | 94.299 |
| ONLINEBUYER | 0.079 | 0.040 | 0.076 | 92.149 |
| ln(1+PASTSPEND) | 3.532 | 3.175 | 3.528 | 98.853 |
| ln(1+PASTRETURN) | 0.604 | 0.411 | 0.642 | 80.196 |
| ln(1+NSTORES) Sq. | 0.302 | 0.303 | 0.296 | -326.628 |
| ln(1+AREAPOPL) Sq. | 102.810 | 102.676 | 103.290 | -257.693 |
| ln(1+TENURE) Sq. | 37.053 | 36.919 | 37.069 | 88.140 |
| ln(1+DIST) Sq. | 2.239 | 2.260 | 2.145 | -349.191 |
| ln(1+PASTSPEND) Sq. | 13.699 | 11.196 | 13.687 | 99.507 |
| ln(1+PASTRETURN) Sq. | 1.957 | 1.268 | 2.063 | 84.568 |
| ln(1+APF) | 0.655 | 0.501 | 0.649 | 96.139 |
| ln(1+APF) Sq. | 0.557 | 0.327 | 0.541 | 93.251 |

Notes: 1,629 adopters are matched with 1,629 non-adopters out of a pool of 7,956 non-adopters pre-matching.

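Table C4. Hidden Bias Sensitivity Test Results

Rosenbaum Sensitivity Test for Wilcoxon Signed Rank P-Value

Unconfounded estimate: 0

| Gamma | Lower bound | Upper bound |
|---|---|---|
| 1 | 0 | 0 |
| 1.1 | 0 | 0 |
| 1.2 | 0 | 0 |
| 1.3 | 0 | 0 |
| 1.4 | 0 | 0 |
| 1.5 | 0 | 0 |
| 1.6 | 0 | 0 |
| 1.7 | 0 | 0 |
| 1.8 | 0 | 0 |
| 1.9 | 0 | 0 |
| 2 | 0 | 0 |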


Table C5. Results of Difference-in-Differences Model with Different Matching Methods

Treatment Effect Coeff. (Std. Err.)

| Variable | (M1) Matching (Mahalanobis metric) | (M2) Matching (calipers) | (M3) Matching (common support with trimming) |
|---|---|---|---|
| Number of purchases | 0.799*** (0.083) | 0.823*** (0.082) | 0.899*** (0.084) |
| Monetary value of purchases | 39.61*** (9.785) | 45.294*** (9.485) | 48.157*** (9.949) |
| Number of returns | | 0.137*** (0.027) | 0.14*** (0.027) |
| Value of returns | 9.446*** (2.286) | 11.109*** (2.253) | 10.052*** (2.337) |
| Net monetary value of purchases | 30.164** (9.092) | 34.186*** (8.862) | 56.777*** (6.583) |
| Number of observations | 6,516 | 6,404 | 6,484 |

Note: *** p < 0.001, ** p < 0.01.

Web Appendix D. Selection on Unobservables

In this section, we present (a) the results of the first-stage probit model used for the Heckman

correction in Table D1, and (b) the results for an alternative difference-in-differences using

future treated cohorts of app adopters as controls in Table D2.

Table D1. First-Stage Probit Model Results

DV = App Adoption

| Variable | Coeff. (Std. Err.) |
|---|---|
| Wireless network (WIRENET)* | 0.196** (0.092) |
| Online buying (ONLINEBUYER)* | 0.381*** (0.066) |
| Symmetry in upload and download speeds (SYMSPEED)* | -0.255** (0.103) |
| Age | -0.015*** (0.001) |
| Gender | -0.256*** (0.042) |
| Tenure | 0.002*** (0.0000) |
| Distance | -0.002 (0.002) |
| Loyalty program level | 0.35*** (0.031) |
| Competitor stores | -0.017 (0.024) |
| Intercept | -1.152*** (0.14) |

Notes: *** p < 0.01, ** p < 0.05; * indicates exclusion restrictions.


Table D2. Alternative Difference-in-Differences Model Results with Future App Adopters as Control Group

Treatment Effect Coeff. (Std. Err.)

| Variable | Unmatched sample | Matched sample |
|---|---|---|
| Number of purchases | 0.758**** (0.076) | 0.688**** (0.097) |
| Monetary value of purchases | 37.236** (11.749) | 34.718*** (11.646) |
| Number of returns | 0.122**** (0.026) | 0.108*** (0.036) |
| Monetary value of returns | 7.216**** (2.207) | 5.25* (2.871) |
| Net monetary value of purchases | 30.021* (11.236) | 29.468*** (10.664) |

Notes: **** p < 0.001, *** p < 0.01, ** p < 0.05, * p < 0.10; in this method, future app adopters from Feb-May 2015 are used as controls for current app adopters from December 2014.