Market Risk Disclosure and Crash Risk: Evidence from Textual Analysis Risk... · 2019-01-18 · Evidence from Textual Analysis : Abstract : The usefulness of market risk disclosures,

Market Risk Disclosure and Crash Risk: Evidence from Textual Analysis

Shingo Goto College of Business

The University of Rhode Island Ballentine Hall, 7 Lippitt Road,

Kingston, RI 02881

Xin Luo College of Business Administration

Marquette University 1225 W. Wisconsin Ave,

Milwaukee, WI 53233

Zhao Wang College of Business

The University of Rhode Island Ballentine Hall, 7 Lippitt Road,

Kingston, RI 02881

October 2018

Abstract

The usefulness of market risk disclosures, mandated by SEC in Item 7A of Form 10-K, has

been a subject of ongoing debate. Using Latent Dirichlet Allocation, we cluster the textual

information in Item 7A into eight latent topics. Disclosures of two topics, both related to risks

in input costs and output prices, are associated significantly with lower future stock price crash

risk. One of them conveys significant information about future crash risk among low-accrual

firms, while the other does so among high-accrual firms. Overall, this study provides a strong

case against the argument that Item 7A provides only boilerplate information.

Keywords: Disclosure; Risk; Market Risk Disclosure; Crash Risk; Latent Dirichlet

Allocation; LDA; Textual Analysis; Language Processing; Boilerplate; Item 7A.


Investors, financial analysts, regulators, and other market participants generally agree on the

necessity of improving the quality of disclosures that firms make to the public about their

exposures to market risk, including interest rate risk, foreign currency exchange rate risk,

commodity price risk, equity price risk, and so on. Enhancing the quality of market risk disclosures

should help investors improve the process of security valuation and analysis (CFA Institute 2016)

and reduce investors’ panics and sensitive trading behaviors in response to unfavorable changes in

market conditions (e.g., Rajgopal 1999; Linsmeier et al. 2002; Thornton and Welker 2004).

With the objective of enhancing the quality of corporate financial disclosures, the US

Securities and Exchange Commission (SEC) issued Financial Reporting Release (FRR) No. 48 in

1997, which requires firm managers to provide narrative disclosures of quantitative and qualitative

information about the risk of loss arising from market prices and rates for debt, equity, currency,

commodity, and other traded instruments as well as their derivatives in Item 7A of Form 10-K

filings.1 Managers should also describe how they manage or hedge the risk exposures associated

with market fluctuations.

While more than 20 years have passed since the SEC mandated narrative market risk

disclosures in Item 7A, opinions on their usefulness remain divided. For example,

investors and financial analysts surveyed by CFA Institute (2016) generally express a low level of

satisfaction with current market risk disclosures, owing to a large amount of boilerplate

information in narrative market risk disclosures.2 There are widely-held perceptions that, while

1 These disclosure requirements are described in Item 305 of Regulation S-K, issued under the Securities

Act of 1933 and the Securities Exchange Act of 1934. In particular, the mandated market risk disclosures require firms to provide information about any market risk sensitive instruments that affect the firms’ financial conditions, in conjunction with the disclosure of quantitative information about market risks of derivatives and hedging (SFAS No. 133, No. 161). Please see Kawaller (2004) for an introduction to the disclosure of derivatives under SFAS No. 133.

2 While CFA Institute’s (2016) discussion mainly surrounds the financial instrument risk disclosure under International Financial Reporting Standards (IFRS), market risk disclosure requirements are very similar between IFRS and US GAAP. Please see KPMG (2017), especially pp.496-498.


firms’ market risk disclosures may have the appearance of valid disclosures, in actuality many of

them only provide routine, redundant, non-specific, and boilerplate information (e.g., Abraham

and Shrieves 2014; CFA Institute 2016; SEC 2016). In a recent concept release, SEC (2016) calls

for further discussions on the usefulness of current market risk disclosures and on strategies to

improve their informativeness.3

Given the ongoing discussions of the usefulness of Item 7A, this study examines whether

Item 7A provides significant information to the public. Specifically, we ask how a firm’s

narrative disclosure in Item 7A is related to the firm’s future stock price “crash risk” (Chen et al.

2001). We focus on the relationship between risk disclosure and future crash risk in this study

because crash risk captures a dimension of downside risk that most market participants and firm

managers are concerned about. Crash risk often reflects the effects of negative jumps in stock

prices (e.g., Jin and Myers 2006; Hutton et al. 2009) that are difficult to contain once they occur.

As such, any information that helps predict or mitigate future crash risk should have significant

economic value for market participants (e.g., Xiong et al. 2016).

A significant challenge in analyzing the information content of Item 7A is the large degree of

discretion each firm manager has over the content and the lack of comparability in narrative market

risk disclosures across firms (Hodder and McAnally 2001). The textual information in Item 7A is

highly unstructured and difficult to classify, quantify, and compare across firms. To address this challenge, we

apply a Latent Dirichlet Allocation (LDA) model (Blei et al. 2003; Bao and Datta 2014) to tease

out latent thematic information (topics) in Item 7A. LDA is essentially a Bayesian text clustering

3 Mr. Wesley Bricker, the Chief Accountant at SEC, recently emphasized the importance of firm managers’

attention and financial reporting oversight of firms’ market risk disclosures. Please see his remarks before the 2018 Baruch College Financial Reporting Conference: “Working Together to Advance Financial Reporting” on May 3, 2018, in New York, NY. https://www.sec.gov/news/speech/speech-bricker-040318#_ftn9.


algorithm that simultaneously uncovers and quantifies latent topics in text documents, without

imposing a set of predefined keywords or topics. Under the assumption that a firm's narrative

disclosure is a finite mixture of topics each of which is a distribution over a fixed vocabulary of

unspecified keywords, LDA classifies narrative market risk disclosures into a small number of

latent topics. We then examine how the disclosure of each latent topic is related to future crash

risk. As we discuss below, we find that some uncovered latent topics in Item 7A convey significant

information about the firms’ future stock price crash risk.
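
For reference, the mixture assumption described above corresponds to the standard (smoothed) LDA generative process of Blei et al. (2003), which can be sketched as follows. The symbols used here (K topics, D documents, document-level topic proportions θ, topic-level word distributions φ, and Dirichlet hyperparameters α and β) are conventional notation introduced for illustration and are not taken from this paper.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1, \dots, K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1, \dots, D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Multinomial}(\theta_d), & n &= 1, \dots, N_d \\
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Multinomial}(\phi_{z_{d,n}})
\end{align*}
```

In this notation, θ_d holds the topic proportions of a firm's Item 7A disclosure d, and φ_k is the word distribution of topic k. Estimation runs the process in reverse: given only the observed words w, it infers θ and φ.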

This study takes a few important steps to improve our understanding of the role and effects

of narrative market risk disclosures in Item 7A. First of all, to our knowledge, this is the first study

that investigates the relationship between mandatory market risk disclosures and future stock price

crash risk. Second, this study contributes to the ongoing regulatory debate over the usefulness of

mandatory market risk disclosures. Specifically, the evidence in this study provides a strong case

against the argument that Item 7A merely provides boilerplate information. Finally, while it is not

necessarily easy to analyze and compare narrative market risk disclosures across firms in the

presence of boilerplate information, this study implements an LDA text clustering algorithm to

uncover significant information in Item 7A that helps investors improve the process of security

analysis and valuation. The information helps reduce investors’ panics and sensitive trading

behaviors in response to unfavorable changes in market conditions, thereby curtailing crash risk.

MAIN FINDINGS AND INTERPRETATIONS

The Information Content of Narrative Market Risk Disclosures

We find that two latent topics in narrative market risk disclosures, which are related to risk

exposures in the firms’ input costs and output prices, are negatively and significantly correlated


with future crash risk. Corroborating earlier studies that demonstrate the informativeness of market

risk disclosures (e.g., Rajgopal 1999; Linsmeier et al. 2002; Thornton and Welker 2004), this study

provides a strong case against the commonly-cited argument that narrative market risk disclosures

in Item 7A provide only boilerplate information. While a large proportion of narrative disclosures

appear to be fairly generic or non-specific, some disclosures convey useful information to the

public.

To proceed with the analysis, we define and measure stock price “crash risk” in two ways.

Following Chen et al. (2001), one of the two measures is the “negative coefficient of skewness of

firm-specific daily returns” (NCSKEW), and the other is the “down-to-up volatility” (DUVOL)

which is the log ratio of the return volatility in the down days to the return volatility in the up days.

Higher values of NCSKEW and DUVOL indicate larger downside risk for a given level of volatility,

thereby measuring the degree of the stock price crash risk. 4
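
As a minimal sketch of the two measures, the Python functions below compute NCSKEW and DUVOL from a list of firm-specific returns for one firm-period. The formulas follow the definitions in Chen et al. (2001) as we understand them; the function names are illustrative, and the sketch assumes the inputs are already firm-specific (market-adjusted) returns. DUVOL is computed here from variance-type terms; using standard deviations instead would simply halve the value.

```python
import math

def ncskew(returns):
    """Negative coefficient of skewness of firm-specific returns."""
    n = len(returns)
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    sum_sq = sum(d ** 2 for d in dev)
    sum_cu = sum(d ** 3 for d in dev)
    # The leading minus sign makes a left-skewed (crash-prone) stock score higher.
    return -(n * (n - 1) ** 1.5 * sum_cu) / ((n - 1) * (n - 2) * sum_sq ** 1.5)

def duvol(returns):
    """Log ratio of down-day to up-day return variability."""
    mean = sum(returns) / len(returns)
    # Down days are those with returns below the period mean (assumption).
    down = [(r - mean) ** 2 for r in returns if r < mean]
    up = [(r - mean) ** 2 for r in returns if r >= mean]
    down_var = sum(down) / (len(down) - 1)  # needs at least two down days
    up_var = sum(up) / (len(up) - 1)        # needs at least two up days
    return math.log(down_var / up_var)

# Hypothetical returns with a heavier left tail.
sample = [0.01, 0.01, 0.01, 0.02, -0.02, -0.03]
print(ncskew(sample), duvol(sample))
```

Both functions return positive values for the hypothetical sample, because its negative returns are larger in magnitude than its positive ones.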

To tease out the highly unstructured textual information in narrative market risk disclosures, we

follow Bao and Datta’s (2014) implementation of LDA. In a sample of US firms between 2002

and 2016, we uncover eight latent topics in their narrative market risk disclosures in Item 7A. We

then classify and quantify each firm’s narrative market risk disclosure by (i) topic assignments (i.e.,

which topics does the firm discuss in Item 7A?) and (ii) topic proportions (i.e., what are the

proportions of topics discussed in the firm’s Item 7A?). We then examine if the assignment and

the proportion of each topic are correlated with two measures of future stock price crash risk,

NCSKEW and DUVOL.
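
The two disclosure measures can be illustrated with a short sketch. It assumes that each sentence of a firm's Item 7A has already been labeled with one of the eight topics (indexed 0 to 7), and that a topic's proportion is its share of sentences. Both the function name and the sentence-share weighting are assumptions made for illustration; the paper's exact construction may differ (for example, by weighting words rather than sentences).

```python
from collections import Counter

def topic_measures(sentence_topics, n_topics=8):
    """Return (assignments, proportions) for one firm's Item 7A.

    sentence_topics: list of topic indices, one per sentence.
    assignments[k] is 1 if topic k appears at all, else 0.
    proportions[k] is the share of sentences labeled with topic k.
    """
    counts = Counter(sentence_topics)
    total = len(sentence_topics)
    assignments = [1 if counts[k] > 0 else 0 for k in range(n_topics)]
    proportions = [counts[k] / total if total else 0.0 for k in range(n_topics)]
    return assignments, proportions

# Hypothetical disclosure with four sentences covering topics 0, 3, and 7.
assignments, proportions = topic_measures([0, 0, 3, 7])
print(assignments)
print(proportions)
```

The assignment vector answers question (i), which topics the firm discusses, while the proportion vector answers question (ii), how much of the disclosure each topic occupies.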

4 As Chen et al. (2001) note, we are adopting a narrow and euphemistic definition of “crashes” by associating it

solely with two measures of negative skewness of the return distribution. The two “crash risk” measures have been widely adopted in the literature. Please see Jin and Myers (2006), Hutton et al. (2009), Kim et al. (2011), Callen and Fang (2013, 2015a, 2015b, 2017), Kim et al. (2014), Kim and Zhang (2016), among many others.


Only two out of the eight topics emerge as significant in predicting future crash risk. The two

informative topics are related to exposures that affect the firms’ output prices and/or input costs.

The first of the two is associated with “commodity price risk and derivatives.” The other

informative topic is associated with “risks in product prices and materials costs,” which have

significant effects on the firms’ future sales and operating costs. Disclosures of these topics,

especially the former, are negatively and significantly correlated with future crash risk. In

particular, the disclosure of “commodity price risk and derivatives” predicts future crash risk

beyond the variables that have been shown to be strong predictors of crash risk (e.g., Chen et al. 2001;

Jin and Myers 2006; Hutton et al. 2009; Callen and Fang 2013; and Kim and Zhang 2016).

One consistent interpretation, along the line of the recent risk disclosure theory of Heinle and

Smith (2017) and Heinle et al. (2018), is that disclosures of these topics help reduce the perceived

uncertainty surrounding the firms’ future cash flow volatilities (not the level of volatilities), when

the firms’ cash flows are sensitive to changes in input (materials) costs and changes in output

(product) prices. Firms with large commodity price risk exposures tend to have large uncertainty

in the volatility of their cash flows with large downside risk. Some firms actively engage in

hedging activities to reduce their cash flow volatilities while others do not. Discussions of

“commodity price risk and derivatives” and “risks in product prices and materials costs” in Item

7A appear to help reduce the perceived uncertainty surrounding the risk exposures of the firms’

future cash flows, which also helps curtail the firms’ future crash risk.

Although it is difficult to draw an unambiguous causal conclusion about the effects that drive

our results, our evidence provides a clear case against the null hypothesis that Item 7A contains

only boilerplate information. In particular, narrative disclosures of risk exposures related to

“commodity price risk and derivatives” and “risks in product prices and materials costs” convey


useful information about the firms’ future crash risk, suggesting the informativeness of some

narrative market risk disclosures in Item 7A.

In contrast, disclosures of the other six topics do not convey significant or consistent

information about future crash risk. These topics are apparently related to disclosures of interest

rate risk and foreign currency exchange rate risk that are discussed in different contexts. While we

refrain from making strong claims from our textual analysis, many disclosures of these risk

exposures may not be specific enough for investors to remove their perceived uncertainty about

the volatility of the firms’ future cash flows.

Market Risk Disclosure, Earnings Management, and Crash Risk

The recent literature emphasizes the behavior of entrenched managers to delay the release of

bad news (the bad news hoarding behavior) as a prominent driver of stock price crash risk. (Please

see the next section for a review.) Consistent with this explanation, the literature has shown that

aggressive earnings management with accruals is positively associated with crash risk (Jin and

Myers 2006; Hutton et al. 2009) while accounting conservatism is negatively associated with crash

risk (Kim and Zhang 2016). Our evidence also confirms that accruals tend to be positively

associated with future crash risk as measured by NCSKEW.

If narrative market risk disclosures help mitigate the managers’ bad news hoarding behavior,

they should help reduce the firms’ future crash risk more among firms with more aggressive

earnings management. To see if/how our textual analysis supports this argument, we group our

sample into two subsamples sorted by the degree of earnings management, as measured by

working capital accruals (Frankel and Sun 2018). Our subsample analysis offers a couple of

empirical insights into the two latent topics that are correlated with future crash risk.


First, among firms with high accruals, the disclosure (topic assignment) of “commodity price

risk and derivatives” is not significantly correlated with future crash risk. The negative correlation

between the disclosure (topic assignment) of “commodity price risk and derivatives” and future crash risk is

concentrated among firms with low accruals. Within the subsample of firms with low accruals,

firms that discuss “commodity price risk and derivatives” have significantly lower future crash

risk than those that do not. We do not see similar differences for the proportion of the same topic

in Item 7A.

Second, we find that the proportion of the discussion (but not the assignment) of “risks in

product prices and materials costs” in Item 7A is negatively correlated with future crash risk

among firms with high accruals, but not among firms with low accruals. Within the subsample of

firms with high accruals, firms that provide relatively voluminous discussions of their “risks in

product prices and materials costs” tend to have lower future crash risk.

Although this simple subsample analysis would not allow us to draw a strong causal

conclusion about the effects of market risk disclosures on future crash risk, our findings are

consistent with the following interpretations:

• Disclosures of the “commodity price risk and derivatives” help lower future crash risk not

because they mitigate the bad news hoarding behavior (aggressive earnings management)

by entrenched managers, but because they reduce the uncertainty surrounding the firms’

cash flow volatilities, consistent with the theory put forth by Heinle and Smith (2017) and

Heinle et al. (2018). The significant correlation between the disclosure of “commodity

price risk and derivatives” and future crash risk is thus concentrated among firms that

maintain relatively transparent financial disclosures.

• Among the firms with high accruals, those providing detailed disclosures of “risks in


product prices and materials costs” may invest more heavily in building inventories and

other working capital to capture growth opportunities (e.g., Zhang 2007; Wu et al. 2010).

These firms exhibit high accruals without implying aggressive earnings management (bad

news hoarding). Consequently, these firms have lower crash risk than other firms with high

accruals. Put differently, the proportion of the disclosure of “risks in product prices and

materials costs” helps distinguish between firms with aggressive earnings management

(bad news hoarding) and those with active working capital investments.

The evidence indicates that we can further refine the information content of some market risk

disclosures by conditioning on the firms’ accruals. While not all narrative market risk disclosures

are informative, some disclosures in Item 7A, especially those of the “commodity price risk and

derivatives” and “risks in product prices and materials costs,” clearly convey non-boilerplate

information to the public.

BACKGROUND AND RELATED LITERATURE

Theoretical Background on Risk Disclosures

Although the market risk disclosure in Item 7A is mandatory, a firm manager has considerable

discretion over how much and what sort of information to disclose (e.g., Hodder and McAnally

2001). On the one hand, there are firms that only restate disclosure requirements or merely provide

their generic risk management policies. On the other hand, there are also firms that commit

themselves to provide specific and detailed information about their exposures to various market

risk factors. It is, therefore, worthwhile to review the recent literature on disclosure theories.

In general, investors and regulators prefer transparent disclosures. It is often argued that firms

that consistently make detailed, timely, and informative disclosures should face lower costs of


public equity and debt capital (e.g., Easley and O’Hara 2004; Hughes et al. 2007). Arthur Levitt,

the former chairman of SEC, claims that high-quality accounting standards are desired because

they lower the cost of capital (Levitt 1998). Mazumdar and Sengupta (2005) report that firms with

higher financial reporting quality also pay lower spreads on their bank loans.

However, transparent disclosure can be a double-edged sword for the shareholders. Suppose

that a firm discloses that its cash flows increase faster due to a decline in the price of a certain rare

metal that is a crucial input for the firm’s production. The disclosure of increased cash flows should

have immediate positive valuation effects, but the information about the declining price of the rare

metal may reveal proprietary information about the firm’s production. The information may

also encourage the firm’s potential competitors to enter the market, which can increase the rare

metal price and hence affect the firm’s future cash flows negatively. Thus firm managers have to

trade off the benefits of providing transparent disclosures against the proprietary costs of

disclosures that can be harmful to the shareholders’ interests (e.g., Verrecchia 1983, 1990,

2001).5 Gao (2010) also argues that the relationship between the quality of financial disclosures

and the cost of capital may not necessarily be negative when disclosures affect the disclosing firms’

investment decisions.

While theoretical discussions of disclosure decisions are traditionally concerned

with disclosures of accounting items (e.g., earnings, cash flows, sales, asset values, etc.), a few

recent studies discuss benefits and costs of providing quality risk disclosures. For example,

Jorgensen and Kirschenheiter (2003) propose a model of equilibrium strategies for voluntarily

disclosing information about the firms’ risk exposures. In their model, firms with low cash flow

5 Skinner (1994, 1995) points out that expected litigation costs have material effects on managers’ voluntary

disclosure decisions.


volatilities choose to disclose their risk exposures voluntarily and truthfully at a cost. These firms

tend to have low costs of capital. Their theory also implies that, when outside investors are more

uncertain about the volatility of the firm’s cash flows, the firm is more likely to disclose its risk

exposures voluntarily. Meanwhile, firms with high cash flow volatilities tend to refrain from

disclosing their risk exposures. These firms provide boilerplate information in mandatory risk

disclosures.

Heinle et al. (2018) recently propose a risk disclosure theory building on Heinle and Smith

(2017). In their model, investors’ perceived uncertainty about the risk exposure of a firm’s cash

flows introduces skewness and excess kurtosis (and all higher moments) in the perceived

distribution of the firm’s future cash flows. Mitigating the uncertainty through risk disclosure may

have positive or negative effects on the firm’s equity value depending on whether the uncertainty

induces positive or negative skewness in the perceived distribution. If the uncertainty induces

positive skewness, the firm would refrain from providing risk disclosures to reduce the uncertainty,

as that would increase the firm’s cost of capital. In this case, the firm should provide boilerplate

information in mandatory risk disclosures. But when the uncertainty induces negative skewness in

the perceived distribution of future cash flows, the firm would curtail crash risk (negative

skewness) by reducing the uncertainty through risk disclosures. This would also lower the firm’s

cost of capital.

To provide a practical perspective on risk disclosure theories, let’s consider an airline

company as an example. Market participants are aware that the company’s cash flows are sensitive

to oil (fuel) price fluctuations. Large increases in oil prices can have large negative effects on the

company’s future cash flows. On the other hand, decreases in oil prices do not necessarily have

large positive effects on future cash flows because airline companies are under constant competitive


pressures to lower airfares. The volatility of the company’s future cash flows, however, depends

on the extent to which the company hedges its oil price risk. Outside investors are uncertain about

the volatility of the company’s future cash flows, which induces negative skewness in their

perceived distributions. In this case, both Jorgensen and Kirschenheiter’s (2003) theory and Heinle

et al.’s (2018) theory imply that the airline company discloses information about its oil price risk

exposure as well as its hedging activities in order to reduce the investors’ uncertainty surrounding

the volatility of the company’s future cash flows. Such risk disclosure should also help curtail the

downside risk (crash risk) in the company’s equity value.

In practice, most airline companies discuss their fuel price risk exposures, along with their

interest rate risk exposures and foreign currency risk exposures, in Item 7A. However, different

airline companies disclose their fuel price risk exposures and other market risk exposures

differently in Item 7A. For example, in its 10-K for fiscal year 2017, Delta Air Lines provides

narrative market risk disclosures with about 470 words in Item 7A, among which the company

discusses its fuel price risk exposures with only 64 words. In contrast, Southwest Airlines’

narrative market risk disclosures in Item 7A are much lengthier with more than 2,740 words. As a

part of the discussion, the company discusses how it uses commodity derivatives to hedge fuel

price risk in detail with more than 460 words. As this simple comparison suggests, some companies

provide more detailed information about their market risk exposures than their industry peers.

Earlier Studies on Stock Price Crash Risk

Since the influential paper by Chen et al. (2001), a growing body of literature has examined the

determinants of stock price crash risk. Recent studies emphasize the role of “bad news hoarding”

by firm managers in driving stock price crash risk. In the presence of large information asymmetry,

firm managers have incentives to withhold bad news from the public while releasing good news


truthfully in a timely manner (Shin 2003; Jin and Myers 2006; Kothari et al. 2009; Hutton et al.

2009). This tends to result in a stockpile of bad news. When bad news accumulates and passes a

threshold, managers can no longer withhold it and release it to the market all at once,

leading to a sharp decline in stock price, that is, a stock price crash.

Chen et al. (2001) emphasize that increases in stock trading volume predict future crash risk

as they capture differences in opinions and many bearish investors tend to get sidelined in the

presence of short-sale constraints. Chen et al. (2001) indeed find that an increase in trading volume

relative to trend over the prior six months is positively related to stock price crash risk, consistent

with an argument that differences of opinion can lead to stock price overvaluation and subsequent

crash in the presence of short-sale constraints. While emphasizing the role of investor

heterogeneity and trading volume in predicting the crash, Chen et al. (2001) suggest that managers

tend to disclose good news right away, while dribbling bad news out slowly.

Managers of opaque firms have more scope for hiding bad news from the market. Empirical

evidence indeed indicates that opaque firms are likely to have higher crash risk. Firms with high

accruals tend to have higher future crash risk than those with low accruals (e.g., Jin and Myers

2006; Hutton et al. 2009), and firms with more conservative accounting practices tend to have

lower crash risk than others (Kim and Zhang 2016). Kim et al. (2014) show that firms that release

high quality corporate social responsibility (CSR) disclosures voluntarily to the market tend to

exhibit lower stock price crash risk than others.

Callen and Fang (2015a) argue that short interest predicts higher stock price crash risk because

short sellers tend to detect bad news hoarding by the managers. Chang et al. (2017) find that high

stock liquidity (as measured by a low effective spread) induces short-term pressure and increases

managers’ ex ante incentives to withhold bad news. As high liquidity also facilitates the exit of


transient institutions, high liquidity can magnify the ex post stock price reactions to bad news

releases, thereby increasing crash risk. Callen and Fang (2013) also find that firms in industries

with high litigation-risk tend to have low crash risk.

Another strand of the literature focuses on the asymmetric information content between

positive news and negative news in the presence of bad news hoarding behavior of firm managers.

Shin (2003) shows that, when a manager has the discretion to disclose or withhold news that is

unverifiable, full disclosure is not supported in equilibrium, but a strategy that discloses all

observed good news and withholds all bad news is, providing a rationale for bad news hoarding

behavior. In his model, risk-averse investors require higher discount rates as they observe more

negative information, leading to larger reactions of stock prices to bad news than to positive news,

thereby increasing the conditional negative return skewness (crash risk).6

Prior Applications of Textual Analyses to Risk Disclosures

An increasing number of studies apply language processing algorithms to analyze textual

information in firms’ risk disclosures.7 For example, Campbell et al. (2014) measure the

informativeness of a firm’s risk factor disclosure by counting the number of risk-related keywords

in Item 1A of Form 10-K. Kravet and Muslu (2013) similarly measure the degree of risk

exposures by counting the number of sentences with at least one predefined risk-related keyword,

including {can/cannot, could, may, might, risk*, uncertain*, likely to, subject to, potential*,

6 Goto et al. (2009) show that non-US firms can attenuate the negative relation between cash-flow news and

discount rates when they cross-list their shares in the US markets with more stringent disclosure requirements than their own local markets. By cross-listing in markets with more stringent disclosures, foreign firms can commit themselves to providing more truthful disclosures.

7 An increasing number of studies have applied textual analysis to parse 10-K disclosures. Please see Hadlock and Pierce (2010), Loughran and McDonald (2011), Bodnaruk et al. (2015), among others, for studies to gauge financial constraints from textual information in 10-K filings. Textual analysis has also been widely used in analyzing the stock market’s reactions to public news since the pioneering studies by Tetlock (2007) and Tetlock et al. (2008). Please see a recent article by Heston and Sinha (2017) and references cited therein.


vary*/varies, depend*, expos*, fluctuat*, possibl*, susceptible, affect, influenc*, and hedg*}.
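
A keyword-based sentence count of this kind can be sketched as follows. The regular expression encodes the keyword list quoted above, treating `*` as "any word ending". The sentence splitter and the function name are simplifying assumptions for illustration, not the procedure of the cited study.

```python
import re

# Keyword list as quoted in the text; "*" stems become \w* suffixes.
RISK_PATTERN = re.compile(
    r"\b(?:can|cannot|could|may|might|risk\w*|uncertain\w*|likely to|subject to|"
    r"potential\w*|vary\w*|varies|depend\w*|expos\w*|fluctuat\w*|possibl\w*|"
    r"susceptible|affect|influenc\w*|hedg\w*)\b",
    re.IGNORECASE,
)

def count_risk_sentences(text):
    """Count sentences containing at least one risk-related keyword."""
    # Naive split on sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return sum(1 for s in sentences if RISK_PATTERN.search(s))

sample = "We hedge fuel costs with swaps. Our headquarters is in Dallas. Prices may fluctuate."
print(count_risk_sentences(sample))  # 2: the first and third sentences match
```

A count like this is sensitive to context-free matches (for example, "may" also matches the month of May), which is one reason an unsupervised topic model can be attractive for unstructured disclosures.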

Hope et al. (2016) employ the Stanford Named Entity Recognition (NER) tool to extract

specific entity names in Item 1A of Form 10-K. These specific entity names include names of

persons, locations, organizations, quantitative numbers in percentages, money values in dollars,

times and dates. They then define the “specificity” of risk disclosures by counting the number of

these specific entity names scaled by the total number of words in that section. They show that the

stock market reacts more to the 10-K filings with higher “specificity.” Similarly, analysts’

reliability in assessing the firms’ fundamental risk tends to be higher for firms with more specific

risk disclosures.
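
The specificity ratio described above reduces to a simple calculation once the entity names are available. The sketch below assumes the entities have already been extracted by a separate NER tool and are passed in as a list; the function name and the whitespace-based word count are illustrative assumptions.

```python
def specificity(entities, text):
    """Number of extracted entity names scaled by the section's word count."""
    n_words = len(text.split())  # whitespace tokenization (assumption)
    return len(entities) / n_words if n_words else 0.0

# Hypothetical sentence with three entities supplied by an external NER step.
section = "We paid Boeing $5 million in 2017 for aircraft parts"
print(specificity(["Boeing", "$5 million", "2017"], section))
```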

Instead of analyzing texts, Loughran and McDonald (2014) propose to use the “file size” of

10-K as a proxy for the readability of financial reports. They argue that financial report users can

effectively obtain more information from more readable financial reports than from less readable

ones. Ertugrul et al. (2017) find that the readability measured by 10-K file size is negatively related

to future stock price crash risk.

Textual information in narrative disclosures is highly unstructured. The same risk-related

keywords may be discussed in completely different contexts by different companies. We thus

employ an unsupervised topic model that does not prespecify any keywords or classifications

before parsing textual documents. In particular, we employ an LDA model (Blei et al. 2003) to

tease out common latent topics in the textual information of narrative market risk disclosures in

Item 7A. A few recent studies, including Dyer et al. (2017) and Brown et al. (2018), also use LDA,

though their objectives and research questions differ from ours.

The traditional LDA assumes that the order in which words appear in a document does not matter,

and it does not consider sentence boundaries. Bao and Datta (2014) argue that this

assumption is not reasonable for risk disclosures because each sentence is unlikely to cover

multiple risks. They therefore assume that all words in the same sentence belong to the same topic. Bao

and Datta (2014) demonstrate the applicability of their sentence-based LDA model by extracting

30 latent topics from textual information in Item 1A.8 Item 1A discusses a wide range of risk-

related topics including environmental risk, litigation risk, regulation risk, supply chain disruption

risk, and so on. Due to the high dimensionality, many of the extracted latent topics do not yield

straightforward interpretations, or they allow for multiple interpretations. This study employs the

sentence-based LDA model of Bao and Datta (2014) by focusing on eight latent topics in Item 7A.

By focusing on eight topics, we are able to provide intuitive interpretations of the latent topics, all

of which are related to changes in market prices or rates and derivatives.

RESEARCH DESIGN

Data and Sample

We source our data from Form 10-Ks from the SEC EDGAR database, CRSP, and Compustat.

We focus on the CRSP-Compustat-EDGAR merged sample of firm-years but exclude firms

operating in the financial industry (SIC codes 6000-6999) and the public utility industry (SIC

codes 4000-4999). We delete observations if any dependent variables or control variables used in

our regressions are missing. Thus, our final sample of market risk disclosures includes 45,417

firm-year observations for the period from 2002 to 2016.9

8 As a clustering algorithm, LDA is similar to the K-means algorithm in that the number

of topics must be specified in advance. This study starts with eight topics, but it would be interesting to examine how the number of topics affects the analysis. We leave this analysis for future work.

9 Although Item 7A was introduced in 1997, it contained very limited information before 2002.

Extracting Latent Topics in Item 7A

In this paper, we characterize market risk exposures as the topics disclosed in Item 7A and

utilize Bao and Datta’s (2014) sentence-based LDA model to extract these topics. The traditional

LDA model is a three-level hierarchical Bayesian model, which considers text documents as

mixtures of topics composed of individual words with certain probabilities (Blei et al. 2003). In

other words, each topic is a distribution of words, and each document is a mixture of corpus-wide

topics. This methodology has recently been used in several accounting studies (e.g., Brown et al.

2018; Dyer et al. 2017). This statistical model allows users to cluster keywords with their

probability and capture corresponding market risk exposures expressed in a descriptive format.

Thus, the key inferential problem for the sentence-based LDA model is “computing the

posterior distribution of the hidden variables θ (topic proportions) and z (topic assignments) given

the model parameters and the observed documents w”:

$$p(\theta, z \mid w, \alpha, \beta) = \frac{p(\theta, z, w \mid \alpha, \beta)}{p(w \mid \alpha, \beta)} \tag{1}$$

In our study, we first use Python to read all Item 7A disclosures in our sample and

analyze the word distribution in the pool of all Item 7A disclosures. Second, we use Bao and Datta’s

(2014) LDA model to analyze the words in the pooled text and identify eight topics into which the words

fit. Our estimation of parameters and latent variables follows the Bayesian estimation method

described in Heinrich (2008) and Bao and Datta (2014). Third, we assign each sentence in every

Item 7A to a topic and summarize the most frequently used words in each topic. We then predict

whether each topic appears and, if so, its proportion, in a specific disclosure based on the word

mixture of that disclosure. Since the LDA model treats each disclosure as a mixture of topics,

we also calculate the proportion of each topic in an Item 7A. Panel A of Table 1 exhibits keywords

for each topic extracted by the LDA model.
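The sentence-level topic assignment described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a collapsed Gibbs sampler in which every sentence receives exactly one topic, and the function name `sentence_lda`, the hyperparameter defaults (`alpha`, `beta`), and the toy corpus are hypothetical choices made only for illustration.

```python
import math
import random
from collections import Counter


def sentence_lda(docs, n_topics, alpha=0.1, beta=0.01, n_iter=50, seed=0):
    """Toy sentence-level LDA via collapsed Gibbs sampling (illustrative only).

    docs: list of documents; each document is a list of sentences;
          each sentence is a list of word tokens.
    Returns (z, theta): one topic label per sentence, and smoothed
    per-document topic proportions.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for sent in doc for w in sent})
    n_dk = [[0] * n_topics for _ in docs]        # sentences of doc d in topic k
    n_kw = [Counter() for _ in range(n_topics)]  # occurrences of word w in topic k
    n_k = [0] * n_topics                         # total words in topic k
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]

    def update(d, s, k, sign):
        # Add (sign=+1) or remove (sign=-1) sentence s of doc d from topic k.
        n_dk[d][k] += sign
        for w in docs[d][s]:
            n_kw[k][w] += sign
        n_k[k] += sign * len(docs[d][s])

    for d, doc in enumerate(docs):
        for s in range(len(doc)):
            update(d, s, z[d][s], +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for s, sent in enumerate(doc):
                update(d, s, z[d][s], -1)
                log_p = []
                for k in range(n_topics):
                    # Document-level term times the probability that topic k
                    # generates every word of the sentence.
                    lp = math.log(n_dk[d][k] + alpha)
                    seen = Counter()
                    for i, w in enumerate(sent):
                        lp += math.log(n_kw[k][w] + beta + seen[w])
                        lp -= math.log(n_k[k] + vocab_size * beta + i)
                        seen[w] += 1
                    log_p.append(lp)
                top = max(log_p)
                weights = [math.exp(lp - top) for lp in log_p]
                z[d][s] = rng.choices(range(n_topics), weights=weights)[0]
                update(d, s, z[d][s], +1)

    theta = [
        [(n_dk[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    return z, theta
```

With eight topics and the pooled Item 7A sentences as input, `z` would play the role of the sentence-level topic labels and `theta` the role of the document-level topic proportions. A production analysis would also need tokenization, stop-word removal, and convergence checks, none of which are shown here.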

As Panel A of Table 1 shows, different topics may share some keywords. To summarize the

risk exposures associated with each topic, we read multiple documents that have a very high

proportion for one specific topic (greater than 95%) and a very low proportion for the other topics (less

than 5%). Using these documents allows us to largely avoid the noise from topics that share

some keywords so that we can tease out the meaning behind each list of keywords. We report our

interpretation of topics in Panel B of Table 1.

For regression analysis, we create dummy variables for each topic (topic assignment, z). The

dummy variable is set equal to one if the firm’s Item 7A covers a particular topic, and zero otherwise.

Thus, we use A_TOPIC1 through A_TOPIC8 to refer to these dummy variables. For regression

analysis, we also employ the proportion of each topic (θ) in Item 7A (P_TOPIC1 through

P_TOPIC8). Each P_TOPIC ranges between 0 and 1.
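As a rough illustration of how the two sets of regressors could be built from sentence-level topic labels, the sketch below computes assignment dummies and proportions for a single Item 7A. It assumes that a topic counts as "covered" when at least one sentence is assigned to it and that the proportion is the share of sentences assigned to the topic; the function name `topic_variables` and the 0-based topic indices are hypothetical.

```python
def topic_variables(sentence_topics, n_topics=8):
    """Build topic dummies and proportions for one Item 7A.

    sentence_topics: list with one 0-based topic index per sentence.
    Returns (a_topic, p_topic), where a_topic[k] is 1 if any sentence is
    assigned to topic k (an assumed rule) and p_topic[k] is the share of
    sentences assigned to topic k.
    """
    counts = [0] * n_topics
    for k in sentence_topics:
        counts[k] += 1
    total = len(sentence_topics)
    p_topic = [c / total if total else 0.0 for c in counts]
    a_topic = [1 if c > 0 else 0 for c in counts]
    return a_topic, p_topic


# A disclosure with four sentences: two on topic index 1, two on topic index 6.
a_topic, p_topic = topic_variables([1, 1, 6, 6])
```

In this example the second and seventh entries of `a_topic` equal 1 and the corresponding entries of `p_topic` equal 0.5, while all other entries are zero.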

<Insert Table 1 Here>

Measurement of Stock Price Crash Risk

We estimate annual stock price crash risk for each firm using two widely-used firm-specific

measures and employ one-year-ahead (fiscal year T+1) crash risk as dependent variables in our

regressions. Our first measure is the negative coefficient of skewness of firm-specific daily returns

(NCSKEW). Following prior literature (Chen et al. 2001; Jin and Myers 2006; Hutton et al. 2009;

Callen and Fang 2013), we first estimate the following expanded market and industry index model

to compute firm-specific daily returns:

$$r_{i,t} = \alpha_i + \beta_{1,i} r_{m,t-1} + \beta_{2,i} r_{j,t-1} + \beta_{3,i} r_{m,t} + \beta_{4,i} r_{j,t} + \beta_{5,i} r_{m,t+1} + \beta_{6,i} r_{j,t+1} + \varepsilon_{i,t} \tag{2}$$

where $r_{i,t}$ is the stock return on firm i for day t, $r_{m,t}$ is the market return for day t, measured

by the return on the CRSP value-weighted market index, and $r_{j,t}$ is the industry return for day t.

We compute value-weighted returns for each industry based on the first two digits of SIC codes.

Next, the residual return from Eq. (2) is used to estimate the firm-specific daily return, $R_{i,t}$,

which is defined as the natural logarithm of one plus the residual return. We compute NCSKEW

for firm i over fiscal year T by taking the negative of the third moment of firm-specific daily returns

and adjusting it by the standard deviation of firm-specific daily returns raised to the third power:

$$NCSKEW_{i,T} = -\frac{n(n-1)^{3/2} \sum R_{i,t}^{3}}{(n-1)(n-2)\left(\sum R_{i,t}^{2}\right)^{3/2}} \tag{3}$$

where n is the number of daily returns in fiscal year T. An increase in NCSKEW represents a more

left-skewed distribution of stock returns, which indicates a higher probability of a price crash.
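Eq. (3) translates directly into a short function. The sketch below is illustrative only: it assumes the input is one firm-year of firm-specific daily returns and applies the sums in Eq. (3) to those returns as given, without any further demeaning. The function name `ncskew` is hypothetical.

```python
def ncskew(returns):
    """Negative coefficient of skewness, following Eq. (3).

    returns: firm-specific daily returns for one firm over one fiscal year.
    """
    n = len(returns)
    if n < 3:
        raise ValueError("at least three returns are needed")
    sum_cubed = sum(r ** 3 for r in returns)
    sum_squared = sum(r ** 2 for r in returns)
    numerator = n * (n - 1) ** 1.5 * sum_cubed
    denominator = (n - 1) * (n - 2) * sum_squared ** 1.5
    return -numerator / denominator


# Three small gains and one large loss: a left-skewed sample.
example = ncskew([0.01, 0.01, 0.01, -0.03])
```

For this left-skewed sample the function returns a positive value (2.0 up to floating-point error), in line with the interpretation that a larger NCSKEW signals higher crash risk.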

Our second measure is down-to-up volatility, DUVOL, which is defined as the natural

logarithm of the ratio of the standard deviation in the down days to the standard deviation in the

up days. More specifically, we calculate DUVOL as follows:

$$DUVOL_{i,T} = \log\left\{ \frac{(n_u - 1) \sum_{Down} R_{i,t}^{2}}{(n_d - 1) \sum_{Up} R_{i,t}^{2}} \right\} \tag{4}$$

where $n_u$ is the number of up days over the fiscal year T and $n_d$ is the number of down days over

the fiscal year T. Similar to NCSKEW, a higher DUVOL corresponds to higher stock price crash

risk because it indicates a more left-skewed distribution of stock returns. Callen and Fang (2013)

suggest that this measure is less likely to be affected by a small number of extreme returns.
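A minimal sketch of Eq. (4) follows. It assumes that "down" ("up") days are those on which the firm-specific return is below (above) its mean over the fiscal year, which is the usual convention in this literature but is not spelled out in the text, and that days exactly equal to the mean are dropped. The function name `duvol` is hypothetical.

```python
import math


def duvol(returns):
    """Down-to-up volatility, following Eq. (4).

    returns: firm-specific daily returns for one firm over one fiscal year.
    Down (up) days are assumed to be days with returns below (above) the
    annual mean; days exactly at the mean are ignored.
    """
    mean = sum(returns) / len(returns)
    down = [r for r in returns if r < mean]
    up = [r for r in returns if r > mean]
    n_d, n_u = len(down), len(up)
    if n_d < 2 or n_u < 2:
        raise ValueError("need at least two up days and two down days")
    down_ss = sum(r ** 2 for r in down)
    up_ss = sum(r ** 2 for r in up)
    return math.log(((n_u - 1) * down_ss) / ((n_d - 1) * up_ss))


# Two up days and two down days; the down days are more dispersed.
example = duvol([0.02, 0.02, -0.01, -0.03])
```

Here the ratio inside the logarithm is 1.25, so the result is positive, indicating more volatility on down days than on up days.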

Control Variables

Following prior literature on stock price crash risk (e.g., Chen et al. 2001; Jin and Myers 2006;

Hutton et al. 2009; Callen and Fang 2013), we employ the following control variables in our

regression analyses: stock return kurtosis (KUR), stock return volatility (SIGMA), cumulative

returns (CUM_RET), book-to-market ratio (BM), leverage ratio (LEV), return on equity (ROE),

firm size (LNSIZE), stock turnover (DTURNOVER), and working-capital accruals (ACCRUALS).

KUR is defined as the kurtosis of firm-specific daily returns over the fiscal year. SIGMA is

defined as the standard deviation of firm-specific daily returns over the fiscal year. CUM_RET is

the cumulative firm-specific daily returns over the fiscal year. We follow Fama and French (2008)

to calculate BM as the natural logarithm of the ratio of the book value of equity to the market value

of equity. LEV is the book value of all liabilities divided by total assets at the end of the fiscal year.

ROE is the income before extraordinary items divided by the book value of equity at the end of

the fiscal year. LNSIZE is defined as the natural logarithm of the market value of equity at the end

of the fiscal year. DTURNOVER is the difference between the average monthly share turnover over

the fiscal year and that over the previous fiscal year. ACCRUALS is defined as the change in

accounts receivable plus the change in inventory minus the change in accounts payable minus the

change in taxes payable plus the change in other assets (Frankel and Sun 2018).

Since stock return characteristics such as crash risk exhibit persistence over time, we include

the lagged dependent variable (one of the two crash risk proxies) as a control variable. This and

other control variables are measured one year (fiscal year T) before the dependent variables. We

control for Fama-French 48-industry membership dummies and year fixed effects. We winsorize

each continuous variable at 1% and 99% of its distribution. Standard errors are clustered at the

firm level.
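The winsorization step can be sketched as follows. The percentile convention is an assumption: the sketch uses the "inclusive" interpolation method of Python's `statistics.quantiles`, and other software may place the 1% and 99% cutoffs slightly differently. The function name `winsorize` is hypothetical.

```python
from statistics import quantiles


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the given lower and upper percentiles.

    The cutoffs come from statistics.quantiles with the inclusive
    method, which is an assumed convention, not the authors' stated one.
    """
    cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, low), high) for v in values]


# 101 evenly spaced values: only the two extremes are pulled in.
clipped = winsorize(list(range(1, 102)))
```

In the example, the smallest value is raised to the 1st-percentile cutoff and the largest is lowered to the 99th-percentile cutoff, while every interior value is left unchanged.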

EMPIRICAL RESULTS

Descriptive Statistics

Table 2 presents descriptive statistics for the variables used in our study. The mean (median)

values of NCSKEW and DUVOL are -0.021 (-0.125) and 0.011 (-0.007), respectively. These values

are close to numbers reported in previous studies. Figure 1 reports time variations of our two crash

risk measures, NCSKEW and DUVOL, between 2002 and 2016. The cross-sectional mean of

NCSKEW spiked in 2002 and 2008 and fell sharply thereafter. Callen and Fang (2017) suggest that

the first spike and fall in 2002-2003 correspond to the effects of the Sarbanes-Oxley Act that

attenuated the withholding of bad news. The second spike and fall in 2008-2009 correspond to the

global financial crisis. The time variation of DUVOL has been much less volatile than that of

NCSKEW.

< Insert Figure 1 Here >

Using Bao and Datta’s (2014) implementation of LDA in a sample of US firms between 2002

and 2016, we uncover eight latent topics in Item 7A. A firm’s disclosure of each topic is

characterized by two latent variables: (i) topic assignment and (ii) topic proportion. If a firm-year

discloses a particular topic, the topic assignment of the firm-year takes a value of 1 for the topic

and 0 otherwise. The topic proportion is the estimated share of Item 7A that each firm-year devotes to

a particular topic. For example, if a firm-year devotes 50% of its sentences

to a particular topic, the firm-year takes a value of 0.5 for that topic.

To run LDA, we require each text document (Item 7A) to contain more than 100 words.

However, the majority of firms have very short narrative market risk disclosures under Item 7A

of 100 words or fewer. While we are able to apply LDA to obtain topic allocations and topic

proportions for 22,116 firm-years (48.7% of the sample), the other 23,301 firm-years (51.3% of

the sample) have no topic allocations or topic proportions. There are a few reasons for very short

narrative disclosures in Item 7A. Some of the 23,301 firm-years may not have any material market

risk exposures to report in Item 7A. Others may choose to provide as little information as possible

in the mandatory market risk disclosures in Item 7A. There are also firms that discuss their market risk

exposures in other sections of 10-K filings and keep Item 7A minimal. Distinguishing among

these possibilities is interesting but difficult, and we leave it for future work. However, since most

of the firm-years with very short Item 7A provide examples of uninformative or unimportant

market risk disclosures, they should provide useful information to our analysis. We, therefore,

retain these 23,301 firm-years without topic assignments or topic proportions by replacing their

missing values with zero. We also include a dummy variable to indicate these firms.10 As it turns

out, excluding these 23,301 firm-years from the analysis does not change the conclusions we discuss

below.
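The treatment of short disclosures can be expressed as a small helper. This is a sketch under stated assumptions: the function name `fill_missing_topics` and its arguments are hypothetical, and the 100-word cutoff follows the requirement stated above that a document contain more than 100 words.

```python
def fill_missing_topics(word_count, a_topic, p_topic, n_topics=8, min_words=100):
    """Zero-fill topic variables for short disclosures and flag them.

    Returns (a_topic, p_topic, miss), where miss is 1 when the Item 7A
    has min_words words or fewer (so LDA was not applied) and 0 otherwise.
    """
    if word_count <= min_words or a_topic is None or p_topic is None:
        return [0] * n_topics, [0.0] * n_topics, 1
    return list(a_topic), list(p_topic), 0


# A short disclosure with no LDA output receives zeros and MISS = 1.
short = fill_missing_topics(80, None, None)
```

Keeping these firm-years with a separate indicator lets the regression distinguish "no topic information" from "topic not discussed" rather than dropping roughly half of the sample.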

Among the eight latent topics extracted in our LDA procedure, Topic 4 and Topic 1 are the

market risk exposures that are most commonly discussed in Item 7A. Around 43.9% and 43.0%

of our firm-years discuss Topic 4 and Topic 1, respectively, in Item 7A of their 10-K filings (Table

2, Panel A). These correspond to about 90% of the 22,116 firm-years that provide Item 7As with

more than 100 words in their narrative market risk disclosures. Disclosures of these two topics

also have larger proportions than those of other topics in Item 7A. Furthermore, proportions of

10 The situation here is somewhat similar to a case in which we examine the information content of R&D

expenses. As Koh and Reeb (2015) report, some firms leave R&D expenses blank in their financial statements even when they actively engage in patent activities. Koh and Reeb (2015) suggest that, when true R&D in blank firms tends to be small, replacing missing R&D with zero and adding a dummy variable to indicate blank R&D firms is more appropriate than excluding the blank R&D firms altogether. In a different context, Pontiff and Woodgate (2008) and McLean, Pontiff, and Watanabe (2009) replace negative book values with zero and include a dummy variable to indicate firms with negative book equity values, rather than excluding them from the sample.

Topic 1 have high correlations with those of other topics, except Topics 2 and 7 (Table 2, Panel

B). Proportions of Topic 4 also have high correlations with those of other topics, except Topic 2.

Thus Topics 1 and 4 appear to capture commonly discussed topics in Item 7A.

In contrast, Topics 2 and 7 are the least commonly discussed topics of market risk exposures.

Around 28.3% and 29.2% of our firm-years discuss these topics (Table 2, Panel A). These

correspond to about 60% of the firm-years that provide Item 7As with more than 100 words.

Proportions of these topics have low correlations with those of other topics. As we described

above, these topics are associated with exposures that affect the firms’ output prices and/or input

costs. Topic 2 is related to “commodity price risk and derivatives,” and Topic 7 is associated with

“risks in product prices and materials costs” that have significant effects on the firms’ future sales

and operating costs. Disclosures of these topics convey more information about the volatility of future

cash flows than disclosures of other topics, and their proportions may exhibit larger variations

across firms.

Not surprisingly, the two measures of crash risk, NCSKEW and DUVOL, are highly correlated

with each other with a correlation coefficient of 0.61 in Table 2, Panel B. In Panel B, we can also

see that the proportion of each topic tends to increase with market capitalization (LNSIZE) and

decrease with the book-to-market ratio (BM). That is, large growth firms tend to spend more than

100 words in Item 7A. These characteristics are also associated with future NCSKEW (one of the

crash risk measures), though the correlation coefficients are not very large. Correlations between

our crash risk measures (NCSKEW, DUVOL) and our control variables are consistent with the prior

literature.

< Insert Table 2 Here >

Table 3 reports relations between the disclosure of each topic and our control variables that

are known to predict future crash risk. Panel A reports Probit regressions of topic assignments and

Panel B reports OLS regressions of topic proportions. After controlling for industry effects and

year effects, the disclosure (assignment) of Topic 2, “commodity price risk and derivatives,” is

positively associated with return on equity (ROE) and firm size (LNSIZE), and negatively

associated with the stock return volatility (SIGMA), cumulative returns (CUM_RET), book-to-

market ratio (BM), financial leverage (LEV), and accruals (ACCRUALS). These control variables,

except accruals (ACCRUALS), also predict the disclosure of Topic 7, "risks in product prices and

materials costs." Firms with low financial leverage (LEV) tend to disclose Topic 5, while firms

with high financial leverage tend to disclose Topic 6 in Item 7A.


Market Risk Disclosures and Future Crash Risk

We examine the impact of market risk exposures on stock price crash risk by estimating the

following baseline regression model:

$$CRASH_{i,T+1} = \beta_0 + \beta_j TOPIC^{j}_{i,T} + \sum_{K} \gamma_K CONTROLS^{K}_{i,T} + YearDummies + IndustryDummies + \varepsilon_{i,T+1} \tag{5}$$

for each topic j=1,…,8, where $CRASH_{i,T+1}$ represents our two crash risk measures, NCSKEW and

DUVOL. $TOPIC^{j}$ refers to each of our eight topics of market risk exposures (either assignments or

proportions) extracted from Item 7A of 10-K forms. We focus on $\beta_j$ because it represents the effect

of each topical disclosure (assignment or proportion) on firm-specific crash risk. $CONTROLS^{K}$

summarizes the set of control variables.

Table 4 reports the regression results of Eq. (5). We first examine the predictive relation

between disclosures of the eight latent topics and future crash risk without controlling for other

predictors of future crash risk. In Table 4, we include only the lagged dependent variable and

the indicator of missing latent topics (MISS) as control variables, in addition to year- and

industry-fixed effects. In all regressions, coefficients on MISS are negative and significant,

consistent with the notion that firms with very short narrative market risk disclosures (in Item 7A)

do not have material risk exposures to report.

In Panel A, we use NCSKEW as the dependent variable and examine the coefficient on each

topic assignment. We find that the coefficients on Topic 2 assignment (-0.048; t=-2.16) and Topic

7 assignment (-0.069; t=-3.20) are negative and statistically significant at the 5% and 1% levels,

respectively. Coefficients on Topic 3 and Topic 5 assignments are positive and moderately

significant at the 10% level. In Panel B, we repeat the analysis with DUVOL as the dependent

variable instead of NCSKEW. Coefficients on Topic 2 and Topic 7 assignments are negative and

significant at the 5% and 10% levels, respectively. The coefficient on Topic 6 is positive and

moderately significant at the 10% level. Taken together, Topic 2 and Topic 7 assignments are

negatively correlated with the two measures of future crash risk (NCSKEW and DUVOL).

In Panel C and Panel D, we repeat the same exercise using Topic proportions rather than Topic

assignments. In regressions of NCSKEW (Panel C), only the coefficient on Topic 7 proportion is

reliably different from zero. It is negative (-0.239; t=-3.42) and significant at the 1% level. In

regressions of DUVOL (Panel D), coefficients of Topic 2 and Topic 7 proportions are negative and

significant at the 10% and 5% levels, respectively.

These regression results suggest that disclosures of Topic 2 (commodity price risk and

derivatives) and Topic 7 (risks in product prices and materials costs) are negatively and

significantly correlated with future crash risk. In contrast, disclosures of the other topics do not

convey significant or consistent information about future crash risk. These topics are apparently

related to disclosures of interest rate and foreign currency exchange rate exposures, though they

are discussed in different contexts.11


To take a deeper look at the information content of these two topical risk disclosures (Topics

2 and 7), we control for other variables that are known to predict future crash risk (please see

Section 3.4) and tabulate regression results in Table 5. In general, higher excess kurtosis (KUR)

and recent specific stock returns (CUM_RET) are positively associated with future crash risk,

measured by either NCSKEW or DUVOL. Recent volatility of specific stock returns (SIGMA), and

book-to-market ratio (BM) are negatively associated with future crash risk. Firm size

(LNSIZE), increases in trading volume (DTURNOVER), and accruals (ACCRUALS) are

positively and significantly correlated with future NCSKEW, but not with future DUVOL. These

results are consistent with the findings of Chen et al. (2001), Hutton et al. (2009), Callen and Fang

(2013), and Zhang (2013), among others.

After controlling for these predictors of future crash risk, disclosures of Topic 2 (commodity

price risk and derivatives) remain significant in predicting future crash risk negatively (Table 5),

no matter whether we use Topic 2 assignment or Topic 2 proportion. Coefficients on disclosures of

Topic 7 (risks in product prices and materials costs) are negative but only moderately significant

(at the 10% level) in the regressions of future NCSKEW, and are not significant in those of future DUVOL. This

exercise shows that disclosures of Topic 2 (commodity price risk and derivatives) provide

significant and incremental information about the firms’ future crash risk. The information

revealed by disclosures of Topic 7 (risks in product prices and materials costs), on the other hand,

appears to be largely reflected in recent characteristics of stock returns (KUR, SIGMA, CUM_RET)

11 For example, Topics 3, 4, and 8 share the following keywords in their vocabularies, {foreign, currency,

exchange}, suggesting that they are related to foreign currency exposures. While these three topics are related to foreign currency exposures, they capture different aspects of foreign currency exposures.

and the book-to-market ratio (BM).12


Grouping by Accruals

As we discussed above, the recent literature emphasizes the bad news hoarding

behavior of entrenched managers as a major source of stock price crash risk. Consistent with the effects of bad

news hoarding behavior, the literature has shown that opaque financial reporting is positively

associated with crash risk (Jin and Myers 2006; Hutton et al. 2009) while accounting conservatism

is negatively associated with crash risk (Kim and Zhang 2016).

If the bad-news-hoarding behavior of entrenched managers drives crash risk and if disclosure

of the “commodity price risk and derivatives” mitigates the managers’ bad news hoarding

behavior, the risk disclosure should help mitigate the crash risk more among firms with aggressive

earnings management (large accruals). To see whether and how the degree of earnings opacity affects the

information content of narrative market risk disclosures, we group the sample into two subsamples

sorted by the degree of earnings management, as measured by working capital accruals (Frankel

and Sun 2018). Specifically, the top-half subsample of accruals consists of firm-years with accruals

above the industry median in each year, and the bottom-half subsample consists of the rest.
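The subsample split can be sketched as below. The record layout (dictionaries with `industry`, `year`, and `accruals` keys) and the function name `split_by_accruals` are hypothetical, and the industry classification used for the median is left to the caller because the text does not specify it.

```python
from collections import defaultdict
from statistics import median


def split_by_accruals(rows):
    """Split firm-years into high- and low-accrual subsamples.

    rows: list of dicts with 'industry', 'year', and 'accruals' keys.
    A firm-year is 'high' if its accruals exceed the median of its
    industry-year group; all remaining firm-years are 'low'.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["industry"], row["year"])].append(row["accruals"])
    medians = {key: median(vals) for key, vals in groups.items()}

    high, low = [], []
    for row in rows:
        cutoff = medians[(row["industry"], row["year"])]
        (high if row["accruals"] > cutoff else low).append(row)
    return high, low


sample = [
    {"industry": 20, "year": 2010, "accruals": a} for a in (0.1, 0.2, 0.3, 0.4)
]
high, low = split_by_accruals(sample)
```

With four firm-years in one industry-year group, the median is 0.25, so the two largest accrual values fall in the top-half subsample and the other two in the bottom half.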

We summarize our subsample analyses for Topic 2 and Topic 7 in Table 6 and Table 7,

respectively. Interestingly, ROE tends to be positively associated with future crash risk among the

subsample of firms with high accruals, consistent with a view that these firms’ earnings tend to be

12 One of the topics (Topic 6), concerned with the “interest rate risk of debts and swaps,” is positively associated

with future crash risk when we control for known predictors of future crash risk. However, the topic does not show a significant correlation with future crash risk when we do not include the control variables. We leave it for future research to investigate further if Topic 6 provides real incremental information about future crash risk.

inflated.

Our evidence indicates that disclosures of “commodity price risk and derivatives” are more

negatively correlated with future stock price crash risk when they are made by firm managers who

are less aggressive in earnings management than others. The significantly negative relation

between the discussion of Topic 2 (topic assignment), “commodity price risk and derivatives,” and future crash risk is

more pronounced among firms with lower accruals. For example, the coefficients on Topic 2

assignment are -0.098 (t=-3.15) for NCSKEW and -0.03 (t=-3.21) for DUVOL after controlling for

the known predictors of crash risk, among the bottom half subsample of accruals, i.e., among the

firms that manage earnings less aggressively than others. These coefficients are not reliably

different from zero among the top half subsample of accruals, indicating that the disclosures of

Topic 2 do not provide incremental information about future crash risk among the firms with high

accruals (aggressive earnings management). However, we do not find a discernible difference in

the coefficients of Topic 7 (assignment) between the top-half and bottom-half subsamples of

accruals.


Disclosures of Topic 2, “commodity price risk and derivatives,” appear to help reduce the

firms’ future crash risk when they are made by firm managers who are not engaged in aggressive

earnings management. The evidence is consistent with an interpretation that disclosures of the

“commodity price risk and derivatives” can help lower future crash risk not because they mitigate

the bad news hoarding behavior of entrenched managers, but because they reduce the uncertainty

surrounding the firms’ cash flow volatilities, as argued by Heinle and Smith (2017) and Heinle et

al. (2018).


Put differently, our LDA textual analysis points to the uncertainty surrounding future cash

flow volatilities as an important determinant of future crash risk that is different from the effects

of the bad news hoarding behavior of entrenched managers on which many recent studies have

focused. When investors are uncertain about the risk exposures of a firm’s future cash flows,

providing information about the risk exposures may help lower the firm’s stock price crash risk.

This theory, put forth by Heinle and Smith (2017) and Heinle et al. (2018), appears to provide a

consistent explanation for the useful information provided by some topics discussed in market risk

disclosures (Item 7A), such as the commodity price risk and derivatives and risks in product

prices and materials costs.

Discussions

Taking up the question of whether the textual information in a firm's narrative market risk disclosure

provides useful information about the firm’s future crash risk, we are able to uncover two latent

topics that exhibit significant predictability of future crash risk. The two topics, “commodity price

risk and derivatives” and “risks in product prices and materials costs” are related to risk

exposures in the firms’ input costs and output prices. In particular, disclosures of the former convey

significant incremental information about the firms’ future crash risk even after we control for

other known predictors of crash risk. The evidence provides a strong case against the widely-held

perception that the mandatory market risk disclosure provides only boilerplate information.

However, this study is also subject to a few limitations and shortcomings.

First, one of the major limitations of this study is the difficulty of identification. While LDA

is useful to tease out a small number of latent topics in narrative market risk disclosures, it is not

clear a priori how many topics we should cluster the textual information into. Furthermore, while

the latent topics uncovered by LDA are quite intuitive, they still allow for multiple interpretations.

They only provide proxies for certain risk exposures.

Second, our empirical implementation of LDA focuses on the market risk disclosure in Item

7A. This reflects our objective of addressing the ongoing discussion about the informativeness of

Item 7A. Meanwhile, we are aware that a few recent studies focus on Item 1A, which discusses a much

broader spectrum of each firm’s risk exposures (Sribunnak and Wong 2006; Kravet and Muslu

2013; Campbell et al. 2014; Hope et al. 2016), since the SEC made Item 1A mandatory in 2005.

Bao and Datta (2014) demonstrate the applicability of the sentence-based LDA model (which we

rely on) by teasing out 30 latent topics from Item 1A, though many of these topics do not yield

straightforward interpretations. It would be interesting to extend our analysis to include additional

risk types (e.g., environmental risk, litigation risk, regulation risk, supply chain disruptions, etc.)

that may help predict the firms’ crash risk. We leave this topic for future research.

Third, we conduct our LDA analysis in a pooled sample of firm-years between 2002 and 2016.

The analysis is inherently an in-sample analysis as is the case with most panel data analysis.

Although we discuss correlations between disclosures in year t and crash risk in year t+1, these

correlations do not necessarily imply out-of-sample predictability. Our analysis is confined to in-

sample predictability. This is because the topic in year t is extracted from the entire pooled sample

of firm-years. We do not view this as a serious concern as most panel regressions used in many

academic studies share the same limitation. However, investors who are interested in

implementing our analysis in real-time should consider running LDA recursively with rolling or

expanding windows.
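One way to organize such a recursive exercise is to enumerate the estimation windows first. The sketch below is only a scheduling helper under assumed conventions: the function name `training_windows`, the `min_years` burn-in, and the `rolling` width are hypothetical parameters, and the topic model itself would still have to be re-estimated on each window.

```python
def training_windows(first_year, last_year, min_years=3, rolling=None):
    """List (training_years, disclosure_year) pairs for recursive estimation.

    With rolling=None the window expands from first_year; otherwise it
    keeps only the most recent `rolling` years. Topics fitted on the
    training years would be applied to disclosures in disclosure_year,
    which are then related to crash risk in the following year.
    """
    windows = []
    for t in range(first_year + min_years - 1, last_year + 1):
        start = first_year if rolling is None else max(first_year, t - rolling + 1)
        windows.append((list(range(start, t + 1)), t))
    return windows


expanding = training_windows(2002, 2016)
rolling_five = training_windows(2002, 2016, rolling=5)
```

Under these defaults the first expanding window covers 2002 to 2004, so no disclosure is ever scored with topics estimated from later filings.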

Conclusion

This study provides a new angle to the regulatory debate on the informativeness of narrative

market risk disclosures in Item 7A of Form 10-K filings. While many market participants would

agree on the importance of improving the quality of market risk disclosures, there are concerns

about their comparability (e.g., Hodder and McAnally 2001). There are also widely-held

perceptions that many market risk disclosures provide only boilerplate information (e.g., Abraham

and Shrives 2014; CFA Institute 2016; SEC 2016).

To shed light on the ongoing discussion, we employ a language processing algorithm based

on LDA (Blei et al. 2003; Bao and Datta 2014) to cluster the textual information in Item 7A into

eight latent topics. We then examine how the disclosure of each topic (topic assignment or topic

proportion) is related to the firm’s future stock price crash risk.

The majority of the latent topics extracted from Item 7A via LDA do not provide significant

information about future crash risk. However, this study still uncovers two latent topics that are

negatively and significantly correlated with future stock price crash risk. These two latent topics

are associated with exposures of the firms’ cash flows to risks in their input costs and/or output

prices. The first topic is about the “commodity price risk and derivatives” and the second topic is

about the “risks in product prices and materials costs.” In particular, the former topic is correlated

with future crash risk even when we control for other predictors of crash risk. This suggests that

the disclosure of “commodity price risk and derivatives” conveys significant and incremental

information about the firms’ future crash risk.

A further analysis suggests that the negative correlation between the disclosure (topic

assignment) of the “commodity price risk and derivatives” and future crash risk is more concentrated

among low-accrual firms than among high-accrual firms. We also find that, among high-accrual

firms, those that devote a large proportion of Item 7A to disclosing “risks in product prices and

materials costs” have a significantly lower crash risk. By conditioning on accruals, we can

enhance the information content of the disclosures of “commodity price risk and derivatives” and

“risks in product prices and materials costs” substantially.
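The accrual conditioning can be sketched as follows, using the rule stated in the table notes: a firm-year is in the High Accruals subsample if its accruals are above the corresponding industry median, and in the Low Accruals subsample otherwise. The field names `industry` and `accruals`, the toy data, and the choice to compute medians by industry alone (rather than by industry and year) are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def split_by_industry_median(rows):
    """Split firm-year rows into (high, low) accrual subsamples."""
    accruals_by_industry = defaultdict(list)
    for row in rows:
        accruals_by_industry[row["industry"]].append(row["accruals"])
    medians = {k: median(v) for k, v in accruals_by_industry.items()}
    # Strictly above the industry median -> High Accruals; the rest -> Low.
    high = [r for r in rows if r["accruals"] > medians[r["industry"]]]
    low = [r for r in rows if r["accruals"] <= medians[r["industry"]]]
    return high, low


# Hypothetical firm-years, for illustration only.
sample = [
    {"firm": "A", "industry": 1, "accruals": 0.05},
    {"firm": "B", "industry": 1, "accruals": -0.02},
    {"firm": "C", "industry": 1, "accruals": 0.01},
    {"firm": "D", "industry": 2, "accruals": 0.10},
    {"firm": "E", "industry": 2, "accruals": 0.00},
]
high, low = split_by_industry_median(sample)
```

The crash risk regressions would then be estimated separately on `high` and `low`.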

Despite a few limitations, the evidence in this paper provides a strong case

against the commonly held perception that Item 7A provides only boilerplate information. In

particular, this study shows that some firms’ narrative market risk disclosures, especially those

related to the “commodity price risk and derivatives” and “risks in product prices and materials

costs,” convey relevant information that is negatively and significantly correlated with future stock

price crash risk. Clearly, they convey useful, non-redundant, and non-boilerplate information to

the public, which should help investors and analysts improve the process of security analysis and

valuation, stock selection, and portfolio risk management.


References

Abraham, Santhosh, and Philip J Shrives. 2014. “Improving the Relevance of Risk Factor Disclosure in Corporate Annual Reports.” The British Accounting Review, vol. 46, no. 1: 91–107.

Bao, Yang, and Anindya Datta. 2014. “Simultaneously Discovering and Quantifying Risk Types from Textual Risk Disclosures.” Management Science, vol. 60, no. 6: 1371–91.

Blei, David M., Andrew Y. Ng, and Michael I. Jordan. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research, vol. 3 (Jan): 993–1022.

Bodnaruk, Andriy, Tim Loughran, and Bill McDonald. 2015. “Using 10-K Text to Gauge Financial Constraints.” Journal of Financial and Quantitative Analysis, vol. 50, no. 4: 623–646.

Brown, Nerissa C., Richard Crowley, and W. Brooke Elliott. 2018. “What Are You Saying? Using Topic to Detect Financial Misreporting.” Working Paper.

Callen, Jeffrey L, and Xiaohua Fang. 2013. “Institutional Investor Stability and Crash Risk: Monitoring versus Short-Termism?” Journal of Banking & Finance, vol. 37, no. 8: 3047–63.

———. 2015a. “Religion and Stock Price Crash Risk.” Journal of Financial and Quantitative Analysis, vol. 50, no. 1–2: 169–95.

———. 2015b. “Short Interest and Stock Price Crash Risk.” Journal of Banking & Finance, vol. 60: 181–94.

———. 2017. “Crash Risk and the Auditor–client Relationship.” Contemporary Accounting Research, vol. 34, no. 3: 1715–50.

Campbell, John L, Hsinchun Chen, Dan S Dhaliwal, Hsin-min Lu, and Logan B Steele. 2014. “The Information Content of Mandatory Risk Factor Disclosures in Corporate Filings.” Review of Accounting Studies, vol. 19, no. 1: 396–455.

CFA Institute. 2016. User Perspective on Financial Instrument Risk Disclosures under International Financial Reporting Standards, Volume 1. Position Paper prepared by Vincent T. Papa, May 2016.

Chang, Xin, Yangyang Chen, and Leon Zolotoy. 2017. “Stock Liquidity and Stock Price Crash Risk.” Journal of Financial and Quantitative Analysis, vol. 52, no. 4: 1605–37.

Chen, Joseph, Harrison Hong, and Jeremy C Stein. 2001. “Forecasting Crashes: Trading Volume, Past Returns, and Conditional Skewness in Stock Prices.” Journal of Financial Economics, vol. 61, no. 3: 345–81.


Dyer, Travis, Mark Lang, and Lorien Stice-Lawrence. 2017. “The Evolution of 10-K Textual Disclosure: Evidence from Latent Dirichlet Allocation.” Journal of Accounting and Economics, vol. 64, no. 2–3: 221–45.

Easley, David, and Maureen O'Hara. 2004. “Information and the Cost of Capital.” The Journal of Finance, vol. 59, no. 4: 1553–83.

Ertugrul, Mine, Jin Lei, Jiaping Qiu, and Chi Wan. 2017. “Annual Report Readability, Tone Ambiguity, and the Cost of Borrowing.” Journal of Financial and Quantitative Analysis, vol. 52, no. 2: 811–36.

Fama, Eugene F, and Kenneth R French. 2008. “Dissecting Anomalies.” The Journal of Finance, vol. 63, no. 4: 1653–78.

Frankel, Richard, and Yan Sun. 2018. “Predicting Accruals Based on Cash-Flow Properties.” The Accounting Review, vol. 93, no. 5: 165–86.

Gao, Pingyang. 2010. “Disclosure Quality, Cost of Capital, and Investor Welfare.” The Accounting Review, vol. 85, no. 1: 1–29.

Goto, Shingo, Masahiro Watanabe, and Yan Xu. 2008. “Strategic Disclosure and Stock Returns: Theory and Evidence from US Cross-Listing.” The Review of Financial Studies, vol. 22, no. 4: 1585–1620.

Hadlock, Charles J., and Joshua R. Pierce. 2010. “New Evidence on Measuring Financial Constraints: Moving Beyond the KZ Index.” The Review of Financial Studies, vol. 23, no. 5: 1909–1940.

Heinle, Mirko S, and Kevin C Smith. 2017. “A Theory of Risk Disclosure.” Review of Accounting Studies, vol. 22, no. 4: 1459–91.

Heinle, Mirko S, Kevin C Smith, and Robert E Verrecchia. 2018. “Risk-Factor Disclosure and Asset Prices.” The Accounting Review, vol. 93, no. 2: 191–208.

Heinrich, G. 2008. “Parameter Estimation for Text Analysis.” Technical Note. vsonix GmbH + University of Leipzig, Germany.

Heston, Steven L., and Nitish Ranjan Sinha. 2017. “News vs. Sentiment: Predicting Stock Returns from News Stories.” Financial Analysts Journal, vol. 73, no. 3: 67–83.

Hodder, Leslie, and Mary Lea McAnally. 2001. “SEC Market-Risk Disclosures: Enhancing Comparability.” Financial Analysts Journal, vol. 57, no. 2: 62–78.

Hope, Ole Kristian, Danqi Hu, and Hai Lu. 2016. “The Benefits of Specific Risk-Factor Disclosures.” Review of Accounting Studies, vol. 21, no. 4: 1005–1045.


Hughes, John S, Jing Liu, and Jun Liu. 2007. “Information Asymmetry, Diversification, and Cost of Capital.” The Accounting Review, vol. 82, no. 3: 705–29.

Hutton, Amy P, Alan J Marcus, and Hassan Tehranian. 2009. “Opaque Financial Reports, R², and Crash Risk.” Journal of Financial Economics, vol. 94, no. 1: 67–86.

Jin, Li, and Stewart C Myers. 2006. “R² around the World: New Theory and New Tests.” Journal of Financial Economics, vol. 79, no. 2: 257–92.

Jorgensen, Bjorn N, and Michael T Kirschenheiter. 2003. “Discretionary Risk Disclosures.” The Accounting Review, vol. 78, no. 2: 449–69.

Kawaller, Ira G. 2004. “What Analysts Need to Know about Accounting for Derivatives.” Financial Analysts Journal, vol. 60, no. 2: 24–30.

Kim, Jeong-Bon, Yinghua Li, and Liandong Zhang. 2011. “CFOs versus CEOs: Equity Incentives and Crashes.” Journal of Financial Economics, vol. 101, no. 3: 713–30.

Kim, Jeong-Bon, and Liandong Zhang. 2016. “Accounting Conservatism and Stock Price Crash Risk: Firm-level Evidence.” Contemporary Accounting Research, vol. 33, no. 1: 412–41.

Kim, Yongtae, Haidan Li, and Siqi Li. 2014. “Corporate Social Responsibility and Stock Price Crash Risk.” Journal of Banking & Finance, vol. 43: 1–13.

Koh, Ping-Sheng, and David M Reeb. 2015. “Missing R&D.” Journal of Accounting and Economics, vol. 60, no. 1: 73–94.

Kothari, Sabino P, Susan Shu, and Peter D Wysocki. 2009. “Do Managers Withhold Bad News?” Journal of Accounting Research, vol. 47, no. 1: 241–76.

KPMG. 2017. “IFRS Compared to US GAAP.” https://assets.kpmg.com/content/dam/kpmg/xx/pdf/2017/12/ifrs-us-gaap-2017.pdf

Kravet, Todd, and Volkan Muslu. 2013. “Textual Risk Disclosures and Investors’ Risk Perceptions.” Review of Accounting Studies, vol. 18, no. 4: 1088–1122.

Levitt, Arthur. 1998. “The Importance of High Quality Accounting Standards.” Accounting Horizons, vol. 12, no. 1: 79–82.

Linsmeier, Thomas J, Daniel B Thornton, Mohan Venkatachalam, and Michael Welker. 2002. “The Effect of Mandated Market Risk Disclosures on Trading Volume Sensitivity to Interest Rate, Exchange Rate, and Commodity Price Movements.” The Accounting Review, vol. 77, no. 2: 343–77.

Loughran, Tim, and Bill McDonald. 2011. “When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks.” The Journal of Finance, vol. 66, no. 1: 35–65.


———. 2014. “Measuring Readability in Financial Disclosures.” The Journal of Finance, vol. 69, no. 4: 1643–71.

Mazumdar, Sumon C, and Partha Sengupta. 2005. “Disclosure and the Loan Spread on Private Debt.” Financial Analysts Journal, vol. 61, no. 3: 83–95.

McLean, R David, Jeffrey Pontiff, and Akiko Watanabe. 2009. “Share Issuance and Cross-Sectional Returns: International Evidence.” Journal of Financial Economics, vol. 94, no. 1: 1–17.

Pontiff, Jeffrey, and Artemiza Woodgate. 2008. “Share Issuance and Cross-sectional Returns.” The Journal of Finance, vol. 63, no. 2: 921–45.

Rajgopal, Shivaram. 1999. “Early Evidence on the Informativeness of the SEC’s Market Risk Disclosures: The Case of Commodity Price Risk Exposure of Oil and Gas Producers.” The Accounting Review, vol. 74, no. 3: 251–80.

Securities and Exchange Commission (SEC). 1997. Disclosure of Accounting Policies for Derivative Financial Instruments, and Derivative Commodity Instruments and Disclosure of Quantitative and Qualitative Information about Market Risk Inherent in Derivative Financial Instruments, Other Financial Instruments, and Derivative Commodity Instruments. Release Nos. 33–7386; 34–38223; IC-22487; FR 48; International Series No.1047; File No. 27-35-95. Washington, D.C.: SEC.

Securities and Exchange Commission (SEC). 2016. SEC Concept Release: Business and Financial Disclosure Required by Regulation S-K. Release Nos. 33-10064; 34-77599; File No. S7-06-16. Washington, D.C.: SEC.

Shin, Hyun Song. 2003. “Disclosures and Asset Returns.” Econometrica, vol. 71, no. 1: 105–33.

Skinner, Douglas J. 1994. “Why Firms Voluntarily Disclose Bad News.” Journal of Accounting Research, vol. 32, no. 1: 38–60.

———. 1995. “Do the SEC’s Safe Harbor Provisions Encourage Forward-Looking Disclosures?” Financial Analysts Journal, vol. 51, no. 4: 38–44.

Sribunnak, Visarut, and M H Franco Wong. 2006. “The Impact of Excluding Nonfinancial Exposure on the Usefulness of Foreign Exchange Sensitivity-Analysis Risk Disclosures.” Journal of Accounting, Auditing & Finance, vol. 21, no. 1: 1–25.

Tetlock, Paul C. 2007. “Giving Content to Investor Sentiment: The Role of Media in the Stock Market.” The Journal of Finance, vol. 62, no. 3: 1139–1168.

Tetlock, Paul C, Maytal Saar-Tsechansky, and Sofus Macskassy. 2008. “More than Words: Quantifying Language to Measure Firms’ Fundamentals.” The Journal of Finance, vol. 63, no. 3: 1437–1467.


Thornton, Daniel B, and Michael Welker. 2004. “The Effect of Oil and Gas Producers’ FRR No. 48 Disclosures on Investors’ Risk Assessments.” Journal of Accounting, Auditing & Finance, vol. 19, no. 1: 85–114.

Verrecchia, Robert E. 1983. “Discretionary Disclosure.” Journal of Accounting and Economics, vol. 5: 179–94.

———. 1990. “Information Quality and Discretionary Disclosure.” Journal of Accounting and Economics, vol. 12, no. 4: 365–80.

———. 2001. “Essays on Disclosure.” Journal of Accounting and Economics, vol. 32, no. 1–3: 97–180.

Wu, Jin, Lu Zhang, and X Frank Zhang. 2010. “The Q-theory Approach to Understanding the Accrual Anomaly.” Journal of Accounting Research, vol. 48, no. 1: 177–223.

Xiong, James X, Thomas M Idzorek, and Roger G Ibbotson. 2016. “The Economic Value of Forecasting Left-Tail Risk.” Journal of Portfolio Management, vol. 42, no. 3: 114–23.

Zhang, X Frank. 2007. “Accruals, Investment, and the Accrual Anomaly.” The Accounting Review, vol. 82, no. 5: 1333–63.

Zhang, Xiao-Jun. 2013. “Book-to-Market Ratio and Skewness of Stock Returns.” The Accounting Review, vol. 88, no. 6: 2213–40.


Figure 1. Cross-sectional Means of Crash Risk Measures over the Sample Period 2002-2016
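For readers who want to reproduce the two crash risk measures, the sketch below follows the standard definitions of Chen, Hong, and Stein (2001), applied to one firm-year of weekly returns. It assumes the inputs are firm-specific weekly returns and that up and down weeks are defined relative to the annual mean; the paper's own equations appear in its methodology section and may differ in detail. The function names and the toy return series are hypothetical.

```python
import math


def ncskew(returns):
    """Negative coefficient of skewness; higher values mean more crash risk."""
    n = len(returns)
    if n < 3:
        raise ValueError("need at least three weekly returns")
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    sum_sq = sum(d ** 2 for d in dev)
    sum_cu = sum(d ** 3 for d in dev)
    return -(n * (n - 1) ** 1.5 * sum_cu) / ((n - 1) * (n - 2) * sum_sq ** 1.5)


def duvol(returns):
    """Down-to-up volatility: log ratio of down-week to up-week variance."""
    mean = sum(returns) / len(returns)
    down = [r - mean for r in returns if r < mean]
    up = [r - mean for r in returns if r >= mean]
    if len(down) < 2 or len(up) < 2:
        raise ValueError("need at least two up weeks and two down weeks")
    down_var = sum(d ** 2 for d in down) / (len(down) - 1)
    up_var = sum(u ** 2 for u in up) / (len(up) - 1)
    return math.log(down_var / up_var)


# Toy series: three small gains and two larger losses (left-skewed).
weekly = [0.02, 0.02, 0.02, -0.03, -0.03]
crash_skew = ncskew(weekly)
crash_duvol = duvol(weekly)
```

Both measures are positive for this left-skewed toy series, in line with the convention that larger values indicate greater crash risk.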


Table 1. Latent Topics Extracted from Narrative Market Risk Disclosures (Item 7A)

Panel A. Keywords Extraction

| Topic | Keywords |
| --- | --- |
| 1 | Interest, rates, market, value, fair, changes, increase, debt, risk, change, cash, approximately, million, decrease |
| 2 | Value, price(s), fair, market, risk, gas, derivative, natural, financial, contracts, oil, commodity, instruments, credit |
| 3 | Foreign, contracts, currency, instruments, derivative, financial, forward, exchange, use, hedge, rate, interest, enter, exposure, trading |
| 4 | Foreign, currency, exchange, changes, rate(s), interest, risk, market, financial, results, exposed, fluctuations, exposure, operations |
| 5 | Investment(s), cash, securities, market, financial, equivalents, credit, portfolio, rate, money, risk, marketable, debt, million |
| 6 | Rate(s), interest, million, debt, credit, outstanding, variable, fixed, swap, amount, borrowings, based, facility, revolving |
| 7 | Price(s), products, cost(s), market, raw, sales, certain, future, operating, significant, material(s), subject |
| 8 | Foreign, currency, million, exchange, net, denominated, dollar, income, value, currencies, losses, approximately, gains, sales, ended |

Panel B. Interpretation of Topics

| Topic | Description |
| --- | --- |
| 1 | Topic 1 discusses whether and how changes in interest rates affect the fair value of firms’ debt instruments and cash flows, and the tools that firms use to manage risks associated with fluctuations in interest rates. |
| 2 | Topic 2 discusses firms’ exposure to commodity risk arising from price volatility for production in the oil markets. It also addresses which derivatives are used to hedge these commodity risks and whether the use of derivatives will limit firms’ access to credit facilities. |
| 3 | Topic 3 discusses whether firms are exposed to foreign currency exchange and interest rate risks, what financial instruments are used to hedge these risks, and whether firms enter transactions for trading or speculative purposes. |
| 4 | Topic 4 discusses whether adverse changes in financial market prices or rates, such as interest rates and foreign currency exchange rates, will cause fluctuations in results of operations. |
| 5 | Topic 5 discusses firms’ holdings of cash, cash equivalents, and marketable securities. It also addresses the extent to which firms invest excess cash and cash equivalents in money market accounts that invest in debt securities or short-term financial instruments so that firms can minimize their exposure to adverse changes in interest rates. |
| 6 | Topic 6 discusses how firms’ fixed- and variable-rate debt obligations and interest rate swaps are affected by interest rate risk, and provides detailed information regarding interest on borrowings under various credit facilities. |
| 7 | Topic 7 discusses firms’ exposure to market risks derived from raw materials and how the fluctuations in the exchange rates of dominant currencies affect the cost of materials, sales, and future operating results. |
| 8 | Topic 8 discusses how foreign currency exchange rates relative to the denominated currencies affect sales and whether the exposure of comprehensive income/losses to realized and unrealized foreign currency gains and losses is significant. |


Table 2. Descriptive Statistics

Panel A. Summary statistics

| VARIABLES | N | MEAN | MEDIAN | STD | MIN | P25 | P75 | MAX |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A_TOPIC1 | 45,417 | 0.430 | 0 | 0.495 | 0 | 0 | 1 | 1 |
| A_TOPIC2 | 45,417 | 0.283 | 0 | 0.450 | 0 | 0 | 1 | 1 |
| A_TOPIC3 | 45,417 | 0.415 | 0 | 0.493 | 0 | 0 | 1 | 1 |
| A_TOPIC4 | 45,417 | 0.439 | 0 | 0.496 | 0 | 0 | 1 | 1 |
| A_TOPIC5 | 45,417 | 0.313 | 0 | 0.464 | 0 | 0 | 1 | 1 |
| A_TOPIC6 | 45,417 | 0.367 | 0 | 0.482 | 0 | 0 | 1 | 1 |
| A_TOPIC7 | 45,417 | 0.292 | 0 | 0.455 | 0 | 0 | 1 | 1 |
| A_TOPIC8 | 45,417 | 0.394 | 0 | 0.489 | 0 | 0 | 1 | 1 |
| P_TOPIC1 | 45,417 | 0.083 | 0 | 0.143 | 0 | 0 | 0.120 | 0.960 |
| P_TOPIC2 | 45,417 | 0.040 | 0 | 0.121 | 0 | 0 | 0.010 | 0.980 |
| P_TOPIC3 | 45,417 | 0.055 | 0 | 0.097 | 0 | 0 | 0.080 | 0.880 |
| P_TOPIC4 | 45,417 | 0.084 | 0 | 0.137 | 0 | 0 | 0.140 | 0.940 |
| P_TOPIC5 | 45,417 | 0.054 | 0 | 0.128 | 0 | 0 | 0.010 | 0.950 |
| P_TOPIC6 | 45,417 | 0.067 | 0 | 0.139 | 0 | 0 | 0.060 | 0.990 |
| P_TOPIC7 | 45,417 | 0.036 | 0 | 0.105 | 0 | 0 | 0.010 | 0.990 |
| P_TOPIC8 | 45,417 | 0.066 | 0 | 0.126 | 0 | 0 | 0.080 | 0.970 |
| NCSKEW | 45,417 | -0.021 | -0.125 | 1.413 | -3.912 | -0.658 | 0.437 | 4.513 |
| DUVOL | 45,417 | 0.011 | -0.007 | 0.411 | -6.861 | -0.229 | 0.226 | 3.441 |
| ROE | 45,417 | -0.119 | 0.064 | 0.852 | -5.966 | -0.102 | 0.146 | 5.615 |
| LEV | 45,417 | 0.180 | 0.140 | 0.182 | 0 | 0.003 | 0.295 | 0.950 |
| ACCRUALS | 45,417 | 0.006 | 0.008 | 0.088 | -0.959 | -0.018 | 0.039 | 0.477 |
| BM | 45,417 | -0.578 | -0.629 | 1.096 | -3.253 | -1.207 | -0.071 | 3.492 |
| DTURNOVER | 45,417 | 0.019 | 0.004 | 1.020 | -3.224 | -0.322 | 0.350 | 3.133 |
| LNSIZE | 45,417 | 5.997 | 6.014 | 2.077 | 0.339 | 4.501 | 7.455 | 10.295 |
| KUR | 45,417 | 7.761 | 3.818 | 11.237 | -0.149 | 1.890 | 8.339 | 65.049 |
| SIGMA | 45,417 | 0.034 | 0.029 | 0.019 | 0.006 | 0.021 | 0.042 | 0.122 |
| CUM_RET | 45,417 | 0.140 | 0.052 | 0.620 | -0.840 | -0.239 | 0.366 | 2.714 |

Panel B. Pearson correlations

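| | VARIABLES | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1) | P_TOPIC1 | 1.00 | | | | | | | | | |
| (2) | P_TOPIC2 | 0.03 | 1.00 | | | | | | | | |
| (3) | P_TOPIC3 | 0.23 | 0.13 | 1.00 | | | | | | | |
| (4) | P_TOPIC4 | 0.22 | 0.05 | 0.37 | 1.00 | | | | | | |
| (5) | P_TOPIC5 | 0.34 | -0.01 | 0.09 | 0.15 | 1.00 | | | | | |
| (6) | P_TOPIC6 | 0.23 | 0.07 | 0.23 | 0.17 | -0.02 | 1.00 | | | | |
| (7) | P_TOPIC7 | 0.04 | 0.13 | 0.11 | 0.12 | 0.00 | 0.08 | 1.00 | | | |
| (8) | P_TOPIC8 | 0.13 | 0.03 | 0.34 | 0.37 | 0.12 | 0.11 | 0.08 | 1.00 | | |
| (9) | LEAD_NCSKEW | 0.04 | 0.01 | 0.03 | 0.04 | 0.03 | 0.03 | 0.00 | 0.03 | 1.00 | |
| (10) | LEAD_DUVOL | 0.02 | 0.00 | 0.02 | 0.02 | 0.01 | 0.02 | 0.00 | 0.02 | 0.61 | 1.00 |
| (11) | KUR | 0.02 | -0.06 | -0.03 | -0.01 | 0.04 | -0.02 | -0.02 | -0.01 | -0.01 | -0.01 |
| (12) | SIGMA | -0.05 | -0.10 | -0.14 | -0.09 | 0.02 | -0.10 | -0.05 | -0.11 | -0.12 | -0.07 |
| (13) | CUM_RET | 0.03 | 0.01 | 0.03 | 0.03 | 0.01 | 0.03 | 0.02 | 0.02 | 0.05 | 0.02 |
| (14) | BM | -0.14 | -0.04 | -0.12 | -0.14 | -0.18 | -0.04 | -0.05 | -0.12 | -0.08 | -0.02 |
| (15) | LEV | -0.03 | 0.08 | 0.01 | -0.08 | -0.19 | 0.17 | 0.01 | -0.07 | 0.00 | 0.00 |
| (16) | ROE | 0.02 | 0.06 | 0.07 | 0.04 | -0.05 | 0.07 | 0.04 | 0.06 | 0.06 | 0.04 |
| (17) | LNSIZE | 0.06 | 0.19 | 0.19 | 0.10 | 0.02 | 0.09 | 0.04 | 0.15 | 0.15 | 0.05 |
| (18) | DTURNOVER | 0.01 | 0.02 | 0.00 | 0.00 | -0.01 | 0.02 | 0.00 | -0.01 | 0.03 | 0.00 |
| (19) | ACCRUALS | 0.03 | 0.00 | 0.02 | 0.02 | 0.02 | 0.02 | 0.01 | 0.03 | 0.04 | 0.00 |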


Table 3. Topic Assignments/Proportions and Firm Characteristics

Panel A. Topic assignments (coefficients, with t-statistics in parentheses)

| VARIABLES | A_TOPIC1 | A_TOPIC2 | A_TOPIC3 | A_TOPIC4 | A_TOPIC5 | A_TOPIC6 | A_TOPIC7 | A_TOPIC8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| KUR | -0.000 (-0.10) | -0.001 (-1.64) | 0.000 (0.19) | -0.001 (-0.89) | 0.000 (0.37) | 0.000 (0.30) | -0.000 (-0.48) | 0.000 (0.08) |
| SIGMA | -2.357*** (-4.48) | -1.919*** (-3.41) | -2.465*** (-4.64) | -2.431*** (-4.62) | -1.380** (-2.46) | -4.025*** (-7.46) | -1.616*** (-2.94) | -1.944*** (-3.64) |
| CUM_RET | -0.084*** (-6.99) | -0.065*** (-5.20) | -0.093*** (-7.79) | -0.078*** (-6.58) | -0.100*** (-7.93) | -0.050*** (-4.16) | -0.049*** (-4.01) | -0.089*** (-7.35) |
| BM | -0.204*** (-28.17) | -0.157*** (-20.74) | -0.190*** (-26.25) | -0.209*** (-28.88) | -0.225*** (-28.19) | -0.163*** (-22.52) | -0.192*** (-25.21) | -0.192*** (-26.14) |
| LEV | -0.176*** (-4.82) | -0.127*** (-3.32) | -0.216*** (-5.88) | -0.365*** (-9.95) | -1.159*** (-28.99) | 0.440*** (11.98) | -0.401*** (-10.47) | -0.463*** (-12.40) |
| ROE | 0.043*** (5.05) | 0.027*** (3.06) | 0.036*** (4.17) | 0.040*** (4.75) | 0.025*** (2.82) | 0.055*** (6.30) | 0.038*** (4.30) | 0.036*** (4.15) |
| LNSIZE | 0.085*** (19.94) | 0.074*** (16.37) | 0.102*** (23.74) | 0.093*** (21.78) | 0.066*** (14.47) | 0.055*** (12.64) | 0.027*** (6.14) | 0.102*** (23.49) |
| DTURNOVER | 0.004 (0.64) | 0.011 (1.64) | 0.005 (0.81) | 0.006 (0.92) | -0.010 (-1.49) | 0.020*** (3.05) | 0.008 (1.22) | 0.000 (0.01) |
| ACCRUALS | -0.020 (-0.26) | -0.189** (-2.41) | -0.094 (-1.27) | -0.078 (-1.06) | 0.101 (1.31) | -0.070 (-0.92) | 0.036 (0.47) | -0.063 (-0.85) |
| INDUSTRY FE | YES | YES | YES | YES | YES | YES | YES | YES |
| YEAR FE | YES | YES | YES | YES | YES | YES | YES | YES |
| OBSERVATIONS | 45,400 | 45,400 | 45,400 | 45,400 | 45,400 | 45,400 | 45,400 | 45,400 |
| PSEUDO R² | 0.078 | 0.052 | 0.077 | 0.082 | 0.101 | 0.068 | 0.048 | 0.080 |

Panel B. Topic proportions (coefficients, with t-statistics in parentheses)

| VARIABLES | P_TOPIC1 | P_TOPIC2 | P_TOPIC3 | P_TOPIC4 | P_TOPIC5 | P_TOPIC6 | P_TOPIC7 | P_TOPIC8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| KUR | 0.000 (1.18) | -0.000*** (-3.79) | 0.000 (1.26) | -0.000 (-1.34) | 0.000 (1.04) | 0.000 (0.86) | -0.000 (-1.17) | 0.000 (1.12) |
| SIGMA | 0.095 (1.53) | 0.102* (1.90) | -0.005 (-0.11) | -0.061 (-1.03) | 0.153** (2.44) | -0.340*** (-5.58) | 0.087* (1.71) | -0.031 (-0.59) |
| CUM_RET | 0.002** (2.00) | -0.002** (-2.09) | -0.002*** (-3.07) | 0.001 (0.82) | -0.002** (-2.48) | 0.004*** (4.37) | 0.001 (1.33) | -0.002** (-2.28) |
| BM | -0.001 (-0.96) | 0.002** (2.21) | 0.001** (2.33) | -0.002** (-2.44) | -0.005*** (-5.61) | 0.005*** (4.86) | -0.002** (-2.26) | 0.002* (1.87) |
| LEV | 0.004 (0.62) | 0.011* (1.74) | 0.004 (0.88) | -0.033*** (-5.66) | -0.090*** (-14.53) | 0.134*** (16.19) | -0.012** (-1.99) | -0.019*** (-3.38) |
| ROE | 0.000 (0.51) | -0.001 (-0.86) | -0.001** (-2.03) | -0.000 (-0.24) | -0.003*** (-3.71) | 0.004*** (4.47) | 0.001 (1.27) | 0.000 (0.27) |
| LNSIZE | -0.003*** (-4.85) | 0.006*** (6.01) | 0.004*** (7.96) | -0.002*** (-3.44) | -0.001* (-1.72) | -0.004*** (-5.71) | -0.003*** (-4.68) | 0.004*** (5.84) |
| DTURNOVER | -0.000 (-0.50) | 0.001 (1.31) | -0.000 (-0.40) | 0.000 (0.77) | -0.001*** (-2.87) | 0.002*** (4.40) | -0.001 (-1.37) | -0.001 (-1.25) |
| ACCRUALS | 0.016*** (2.65) | -0.019*** (-4.27) | -0.014*** (-3.77) | 0.003 (0.58) | 0.013** (2.36) | -0.006 (-1.00) | 0.002 (0.44) | 0.004 (0.76) |
| MISS | -0.170*** (-64.02) | -0.085*** (-29.62) | -0.110*** (-59.62) | -0.171*** (-69.70) | -0.111*** (-40.39) | -0.139*** (-49.86) | -0.078*** (-32.45) | -0.134*** (-52.67) |
| INDUSTRY FE | YES | YES | YES | YES | YES | YES | YES | YES |
| YEAR FE | YES | YES | YES | YES | YES | YES | YES | YES |
| OBSERVATIONS | 45,417 | 45,417 | 45,417 | 45,417 | 45,417 | 45,417 | 45,417 | 45,417 |
| ADJUSTED R² | 0.383 | 0.205 | 0.368 | 0.413 | 0.283 | 0.318 | 0.182 | 0.341 |

Notes: This table reports relations between the disclosure of each topic and our control variables that are known to predict future crash risk. A_TOPICs are topic assignments, and P_TOPICs are topic proportions. Panel A reports the results of Probit regressions of topic assignments, and Panel B reports the results of OLS regressions of topic proportions. All regressions include Fama-French 48 industry dummies and year fixed effects. MISS is an indicator variable that takes the value of one for firm-years without topic assignments or topic proportions due to very short market risk disclosures in Item 7A (100 words or fewer). t-statistics, based on robust standard errors clustered at the firm level, are reported in parentheses. Continuous variables are winsorized at the 1% and 99% levels. ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively.
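Two of the data-preparation steps named in these notes, winsorizing at the 1st and 99th percentiles and flagging very short disclosures with MISS, can be sketched as follows. The percentile interpolation method, the whitespace-based word count, and the function names are assumptions made for illustration; the notes do not specify them.

```python
from statistics import quantiles


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the given lower and upper percentiles."""
    # quantiles(..., n=100) returns the 99 percentile cut points; the
    # "inclusive" interpolation method is an assumption, not the paper's.
    cuts = quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, lo), hi) for v in values]


def miss_indicator(item_7a_text, max_words=100):
    """Return 1 if the disclosure has max_words words or fewer, else 0."""
    # Words are counted by splitting on whitespace (an assumption).
    return int(len(item_7a_text.split()) <= max_words)


clipped = winsorize(list(range(101)))
short_flag = miss_indicator("risk " * 100)
long_flag = miss_indicator("risk " * 101)
```

In the toy input of the integers 0 through 100, only the two extreme values are pulled in to the 1st and 99th percentile cut points; every other value is unchanged.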


Table 4. Latent Topics and Future Crash Risk

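Panel A. Lead NCSKEW on each topic assignment without control variables (coefficients, with t-statistics in parentheses)

| VARIABLES | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A_TOPIC1 | 0.003 (0.11) | | | | | | | |
| A_TOPIC2 | | -0.048** (-2.16) | | | | | | |
| A_TOPIC3 | | | 0.055* (1.86) | | | | | |
| A_TOPIC4 | | | | -0.018 (-0.53) | | | | |
| A_TOPIC5 | | | | | 0.037* (1.66) | | | |
| A_TOPIC6 | | | | | | 0.036 (1.41) | | |
| A_TOPIC7 | | | | | | | -0.069*** (-3.20) | |
| A_TOPIC8 | | | | | | | | 0.017 (0.62) |
| NCSKEW | 0.039*** (6.36) | 0.039*** (6.33) | 0.039*** (6.36) | 0.039*** (6.35) | 0.039*** (6.36) | 0.039*** (6.35) | 0.039*** (6.31) | 0.039*** (6.36) |
| MISS | -0.151*** (-4.71) | -0.183*** (-8.87) | -0.107*** (-3.66) | -0.170*** (-5.07) | -0.131*** (-6.42) | -0.127*** (-5.16) | -0.196*** (-9.69) | -0.140*** (-5.25) |
| INDUSTRY FE | YES | YES | YES | YES | YES | YES | YES | YES |
| YEAR FE | YES | YES | YES | YES | YES | YES | YES | YES |
| OBSERVATIONS | 40,184 | 40,184 | 40,184 | 40,184 | 40,184 | 40,184 | 40,184 | 40,184 |
| ADJUSTED R² | 0.012 | 0.012 | 0.012 | 0.012 | 0.012 | 0.012 | 0.013 | 0.012 |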

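Panel B. Lead DUVOL on each topic assignment without control variables (coefficients, with t-statistics in parentheses)

| VARIABLES | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A_TOPIC1 | -0.003 (-0.36) | | | | | | | |
| A_TOPIC2 | | -0.016** (-2.46) | | | | | | |
| A_TOPIC3 | | | 0.012 (1.42) | | | | | |
| A_TOPIC4 | | | | -0.006 (-0.58) | | | | |
| A_TOPIC5 | | | | | 0.003 (0.46) | | | |
| A_TOPIC6 | | | | | | 0.014* (1.91) | | |
| A_TOPIC7 | | | | | | | -0.011* (-1.75) | |
| A_TOPIC8 | | | | | | | | 0.002 (0.20) |
| DUVOL | 0.073*** (11.22) | 0.073*** (11.20) | 0.073*** (11.22) | 0.073*** (11.22) | 0.073*** (11.22) | 0.073*** (11.21) | 0.073*** (11.21) | 0.073*** (11.22) |
| MISS | -0.031*** (-3.06) | -0.037*** (-4.72) | -0.018* (-1.88) | -0.033*** (-3.06) | -0.026*** (-3.80) | -0.017* (-1.89) | -0.035*** (-4.34) | -0.027*** (-3.02) |
| INDUSTRY FE | YES | YES | YES | YES | YES | YES | YES | YES |
| YEAR FE | YES | YES | YES | YES | YES | YES | YES | YES |
| OBSERVATIONS | 40,177 | 40,177 | 40,177 | 40,177 | 40,177 | 40,177 | 40,177 | 40,177 |
| ADJUSTED R² | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 |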


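Panel C. Lead NCSKEW on each topic proportion without control variables (coefficients, with t-statistics in parentheses)

| VARIABLES | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| P_TOPIC1 | 0.070 (1.07) | | | | | | | |
| P_TOPIC2 | | -0.085 (-1.47) | | | | | | |
| P_TOPIC3 | | | 0.070 (0.80) | | | | | |
| P_TOPIC4 | | | | 0.071 (1.00) | | | | |
| P_TOPIC5 | | | | | 0.065 (0.95) | | | |
| P_TOPIC6 | | | | | | -0.014 (-0.22) | | |
| P_TOPIC7 | | | | | | | -0.239*** (-3.42) | |
| P_TOPIC8 | | | | | | | | 0.047 (0.64) |
| NCSKEW | 0.039*** (6.36) | 0.039*** (6.35) | 0.039*** (6.35) | 0.039*** (6.35) | 0.039*** (6.36) | 0.039*** (6.36) | 0.039*** (6.32) | 0.039*** (6.35) |
| MISS | -0.142*** (-7.84) | -0.162*** (-9.79) | -0.146*** (-7.98) | -0.142*** (-7.45) | -0.147*** (-8.96) | -0.156*** (-8.95) | -0.172*** (-10.67) | -0.148*** (-8.24) |
| INDUSTRY FE | YES | YES | YES | YES | YES | YES | YES | YES |
| YEAR FE | YES | YES | YES | YES | YES | YES | YES | YES |
| OBSERVATIONS | 40,184 | 40,184 | 40,184 | 40,184 | 40,184 | 40,184 | 40,184 | 40,184 |
| ADJUSTED R² | 0.012 | 0.012 | 0.012 | 0.012 | 0.012 | 0.012 | 0.013 | 0.012 |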

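Panel D. Lead DUVOL on each topic proportion without control variables (coefficients, with t-statistics in parentheses)

| VARIABLES | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| P_TOPIC1 | 0.012 (0.64) | | | | | | | |
| P_TOPIC2 | | -0.034* (-1.81) | | | | | | |
| P_TOPIC3 | | | 0.016 (0.62) | | | | | |
| P_TOPIC4 | | | | 0.022 (1.09) | | | | |
| P_TOPIC5 | | | | | -0.001 (-0.06) | | | |
| P_TOPIC6 | | | | | | 0.017 (0.94) | | |
| P_TOPIC7 | | | | | | | -0.050** (-2.38) | |
| P_TOPIC8 | | | | | | | | 0.012 (0.59) |
| DUVOL | 0.073*** (11.22) | 0.073*** (11.20) | 0.073*** (11.22) | 0.073*** (11.22) | 0.073*** (11.22) | 0.073*** (11.21) | 0.072*** (11.21) | 0.073*** (11.22) |
| MISS | -0.026*** (-3.75) | -0.031*** (-4.34) | -0.026*** (-3.57) | -0.024*** (-3.43) | -0.028*** (-4.33) | -0.026*** (-3.53) | -0.032*** (-4.52) | -0.026*** (-3.78) |
| INDUSTRY FE | YES | YES | YES | YES | YES | YES | YES | YES |
| YEAR FE | YES | YES | YES | YES | YES | YES | YES | YES |
| OBSERVATIONS | 40,177 | 40,177 | 40,177 | 40,177 | 40,177 | 40,177 | 40,177 | 40,177 |
| ADJUSTED R² | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 |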

Notes: This table reports OLS regressions of future crash risk, Eq. (4). A_TOPICs are topic assignments, and P_TOPICs are topic proportions. Dependent variables in Panels A and C are NCSKEW in year t+1; those in Panels B and D are DUVOL in year t+1. Panels A and B show the relationship between topic assignments and future stock price crash risk. Panels C and D show the relationship between topic proportions and future stock price crash risk. All regressions include Fama-French 48 industry dummies and year fixed effects. MISS is an indicator variable that takes the value of one for firm-years without topic assignments or topic proportions due to very short market risk disclosures in Item 7A (100 words or fewer). t-statistics, based on robust standard errors clustered at the firm level, are reported in parentheses. Continuous variables are winsorized at the 1% and 99% levels. ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively.
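As a reading aid, the specification implied by the rows of this table can be written out as below. This is only a restatement inferred from the listed regressors (one topic variable, the current-year crash risk measure, MISS, and the two sets of fixed effects); the exact form of Eq. (4) is given earlier in the paper, and the symbol names here are illustrative.

```latex
\begin{equation*}
\mathrm{CrashRisk}_{i,t+1}
  = \alpha
  + \beta \,\mathrm{TOPIC}_{k,i,t}
  + \gamma \,\mathrm{CrashRisk}_{i,t}
  + \delta \,\mathrm{MISS}_{i,t}
  + \mathrm{IndustryFE}
  + \mathrm{YearFE}
  + \varepsilon_{i,t+1}
\end{equation*}
```

Here CrashRisk is NCSKEW or DUVOL, TOPIC is the assignment (A_TOPIC) or proportion (P_TOPIC) of topic k for firm i in year t, and a negative β indicates that disclosure of the topic is associated with lower crash risk in the following year.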


Table 5. Latent Topics and Future Crash Risk (with Control Variables)

Coefficients, with t-statistics in parentheses.

| VARIABLES | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Dependent variable | Lead_NCSKEW | Lead_NCSKEW | Lead_DUVOL | Lead_DUVOL | Lead_NCSKEW | Lead_NCSKEW | Lead_DUVOL | Lead_DUVOL |
| A_TOPIC2 | -0.052** (-2.35) | | -0.016** (-2.55) | | | | | |
| P_TOPIC2 | | -0.213*** (-3.58) | | -0.050** (-2.56) | | | | |
| A_TOPIC7 | | | | | -0.039* (-1.78) | | -0.007 (-1.08) | |
| P_TOPIC7 | | | | | | -0.133* (-1.90) | | -0.032 (-1.58) |
| NCSKEW | 0.045*** (7.02) | 0.045*** (7.00) | | | 0.045*** (7.05) | 0.045*** (7.05) | | |
| DUVOL | | | 0.071*** (10.39) | 0.071*** (10.36) | | | 0.071*** (10.40) | 0.071*** (10.40) |
| MISS | -0.092*** (-4.48) | -0.079*** (-4.83) | -0.022*** (-3.04) | -0.017** (-2.54) | -0.084*** (-4.17) | -0.071*** (-4.41) | -0.017** (-2.35) | -0.015** (-2.41) |
| KUR | 0.002** (2.11) | 0.002** (2.08) | 0.001* (1.88) | 0.001* (1.88) | 0.002** (2.13) | 0.002** (2.12) | 0.001* (1.89) | 0.001* (1.89) |
| SIGMA | -5.053*** (-8.66) | -5.032*** (-8.62) | -2.785*** (-6.72) | -2.781*** (-6.72) | -5.053*** (-8.65) | -5.040*** (-8.63) | -2.785*** (-6.72) | -2.782*** (-6.72) |
| CUM_RET | 0.131*** (9.34) | 0.130*** (9.31) | 0.042*** (9.29) | 0.042*** (9.28) | 0.131*** (9.37) | 0.131*** (9.37) | 0.042*** (9.30) | 0.042*** (9.30) |
| BM | -0.036*** (-4.98) | -0.035*** (-4.91) | -0.009** (-2.39) | -0.009** (-2.35) | -0.036*** (-5.03) | -0.036*** (-5.01) | -0.009** (-2.40) | -0.009** (-2.40) |
| LEV | -0.077* (-1.73) | -0.076* (-1.72) | -0.007 (-0.37) | -0.007 (-0.37) | -0.081* (-1.82) | -0.080* (-1.80) | -0.008 (-0.42) | -0.008 (-0.41) |
| ROE | 0.014 (1.38) | 0.014 (1.37) | 0.008 (0.66) | 0.008 (0.66) | 0.015 (1.39) | 0.015 (1.39) | 0.008 (0.67) | 0.008 (0.67) |
| LNSIZE | 0.063*** (13.67) | 0.064*** (13.89) | -0.001 (-0.63) | -0.001 (-0.49) | 0.062*** (13.49) | 0.062*** (13.54) | -0.001 (-0.70) | -0.001 (-0.69) |
| DTURNOVER | 0.033*** (4.44) | 0.033*** (4.46) | 0.003 (1.16) | 0.003 (1.16) | 0.033*** (4.44) | 0.033*** (4.42) | 0.003 (1.15) | 0.003 (1.14) |
| ACCRUALS | 0.258*** (3.02) | 0.256*** (3.01) | -0.064 (-0.68) | -0.064 (-0.68) | 0.261*** (3.06) | 0.260*** (3.05) | -0.063 (-0.67) | -0.064 (-0.67) |
| INDUSTRY FE | YES | YES | YES | YES | YES | YES | YES | YES |
| YEAR FE | YES | YES | YES | YES | YES | YES | YES | YES |
| OBSERVATIONS | 40,184 | 40,184 | 40,177 | 40,177 | 40,184 | 40,184 | 40,177 | 40,177 |
| ADJUSTED R² | 0.036 | 0.036 | 0.014 | 0.014 | 0.036 | 0.036 | 0.014 | 0.014 |

Notes: This table reports OLS regressions of future crash risk, Eq. (4). A_TOPICs are topic assignments, and P_TOPICs are topic proportions. Dependent variables in columns (1), (2), (5), and (6) are NCSKEW in year t+1; those in columns (3), (4), (7), and (8) are DUVOL in year t+1. Regressions in this table include additional control variables that are known to predict future crash risk. This table focuses on the effects of Topic 2 (commodity price risk and derivatives) and Topic 7 (risks in product prices and materials costs). All regressions include Fama-French 48 industry dummies and year fixed effects. MISS is an indicator variable that takes the value of one for firm-years without topic assignments or topic proportions due to very short market risk disclosures in Item 7A (100 words or fewer). t-statistics, based on robust standard errors clustered at the firm level, are reported in parentheses. Continuous variables are winsorized at the 1% and 99% levels. ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively.
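The star convention in these notes can be checked against the reported t-statistics with a short sketch. It assumes two-sided tests with large-sample normal critical values (1.645, 1.960, 2.576); the paper does not state which critical values it uses, and the function name is hypothetical.

```python
def significance_stars(t_stat):
    """Map a t-statistic to the table's star notation.

    Assumes two-sided tests with normal critical values, which is a
    reasonable approximation for samples of this size.
    """
    t = abs(t_stat)
    if t >= 2.576:
        return "***"  # significant at the 1% level
    if t >= 1.960:
        return "**"   # significant at the 5% level
    if t >= 1.645:
        return "*"    # significant at the 10% level
    return ""


# t-statistics of the four topic coefficients in columns (1), (2), (5), (7)
stars = [significance_stars(t) for t in (-2.35, -3.58, -1.78, -1.08)]
```

Under this assumption the sketch reproduces the stars attached to those four coefficients in the table.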


Table 6. Grouping by Accruals: The Informativeness of Topic 2 (Commodity Price Risk and Derivatives)

Coefficients, with t-statistics in parentheses.

| VARIABLES | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Dependent variable | Lead_NCSKEW | Lead_NCSKEW | Lead_NCSKEW | Lead_NCSKEW | Lead_DUVOL | Lead_DUVOL | Lead_DUVOL | Lead_DUVOL |
| Subsample | High Accruals | High Accruals | Low Accruals | Low Accruals | High Accruals | High Accruals | Low Accruals | Low Accruals |
| A_TOPIC2 | -0.007 (-0.23) | | -0.098*** (-3.15) | | -0.003 (-0.31) | | -0.030*** (-3.21) | |
| P_TOPIC2 | | -0.241*** (-2.96) | | -0.177** (-2.15) | | -0.049* (-1.89) | | -0.050* (-1.70) |
| NCSKEW | 0.045*** (4.82) | 0.045*** (4.79) | 0.045*** (5.03) | 0.045*** (5.05) | | | | |
| DUVOL | | | | | 0.073*** (8.41) | 0.073*** (8.39) | 0.069*** (6.51) | 0.069*** (6.51) |
| KUR | 0.002 (1.46) | 0.002 (1.44) | 0.002 (1.56) | 0.002 (1.56) | 0.001*** (2.60) | 0.001*** (2.59) | 0.001 (1.06) | 0.001 (1.07) |
| SIGMA | -5.255*** (-6.14) | -5.229*** (-6.11) | -5.072*** (-6.43) | -5.050*** (-6.40) | -3.082*** (-10.97) | -3.078*** (-10.96) | -2.692*** (-3.98) | -2.686*** (-3.98) |
| CUM_RET | 0.152*** (7.76) | 0.151*** (7.73) | 0.106*** (5.43) | 0.106*** (5.41) | 0.046*** (8.06) | 0.046*** (8.04) | 0.036*** (5.28) | 0.036*** (5.28) |
| BM | -0.033*** (-3.29) | -0.033*** (-3.23) | -0.034*** (-3.55) | -0.034*** (-3.49) | -0.007** (-2.40) | -0.007** (-2.35) | -0.009 (-1.48) | -0.009 (-1.45) |
| LEV | -0.086 (-1.33) | -0.082 (-1.27) | -0.061 (-1.01) | -0.063 (-1.05) | -0.005 (-0.25) | -0.004 (-0.21) | -0.009 (-0.28) | -0.010 (-0.31) |
| ROE | 0.038** (2.32) | 0.037** (2.31) | 0.001 (0.11) | 0.002 (0.12) | 0.024*** (5.02) | 0.024*** (5.01) | -0.002 (-0.09) | -0.002 (-0.09) |
| LNSIZE | 0.063*** (9.56) | 0.064*** (9.75) | 0.064*** (10.47) | 0.065*** (10.54) | -0.002 (-1.06) | -0.002 (-0.91) | 0.001 (0.21) | 0.001 (0.27) |
| DTURNOVER | 0.040*** (3.91) | 0.040*** (3.92) | 0.022** (2.06) | 0.021** (2.04) | 0.005 (1.54) | 0.005 (1.55) | 0.001 (0.19) | 0.001 (0.17) |
| ACCRUALS | 0.312* (1.69) | 0.308* (1.67) | 0.034 (0.25) | 0.036 (0.26) | 0.051 (0.93) | 0.051 (0.92) | -0.200 (-1.15) | -0.199 (-1.14) |
| MISS | -0.070** (-2.36) | -0.086*** (-3.68) | -0.115*** (-4.00) | -0.072*** (-3.14) | -0.017** (-2.03) | -0.019*** (-2.87) | -0.028** (-2.36) | -0.015 (-1.31) |
| INDUSTRY FE | YES | YES | YES | YES | YES | YES | YES | YES |
| YEAR FE | YES | YES | YES | YES | YES | YES | YES | YES |
| OBSERVATIONS | 20,164 | 20,164 | 20,020 | 20,020 | 20,160 | 20,160 | 20,017 | 20,017 |
| ADJUSTED R² | 0.034 | 0.034 | 0.036 | 0.036 | 0.032 | 0.032 | 0.008 | 0.007 |

Notes: This table repeats the analysis in Table 5 after splitting the sample into two subsamples by the working capital accruals of Frankel and Sun (2018). This table focuses on the information content of Topic 2 (commodity price risk and derivatives). The High Accruals (top-half) subsample consists of firm-year observations with accruals that are above the corresponding industry medians. The Low Accruals (bottom-half) subsample includes the rest. All regressions include Fama-French 48 industry dummies and year fixed effects. MISS is an indicator variable that takes the value of one for firm-years without topic assignments or topic proportions due to very short market risk disclosures in Item 7A (100 words or fewer). t-statistics, based on robust standard errors clustered at the firm level, are reported in parentheses. Continuous variables are winsorized at the 1% and 99% levels. ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively.


Table 7. Grouping by Accruals: The Informativeness of Topic 7 (Risks in Product Prices and Materials Costs)

Coefficients, with t-statistics in parentheses.

| VARIABLES | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Dependent variable | Lead_NCSKEW | Lead_NCSKEW | Lead_NCSKEW | Lead_NCSKEW | Lead_DUVOL | Lead_DUVOL | Lead_DUVOL | Lead_DUVOL |
| Subsample | High Accruals | High Accruals | Low Accruals | Low Accruals | High Accruals | High Accruals | Low Accruals | Low Accruals |
| A_TOPIC7 | -0.039 (-1.23) | | -0.044 (-1.40) | | -0.006 (-0.72) | | -0.008 (-0.83) | |
| P_TOPIC7 | | -0.251** (-2.36) | | -0.022 (-0.22) | | -0.054* (-1.87) | | -0.011 (-0.38) |
| NCSKEW | 0.045*** (4.81) | 0.045*** (4.81) | 0.045*** (4.81) | 0.045*** (5.09) | | | | |
| DUVOL | | | | | 0.073*** (8.41) | 0.073*** (8.41) | 0.069*** (6.55) | 0.069*** (6.55) |
| KUR | 0.002 (1.46) | 0.002 (1.46) | 0.002 (1.46) | 0.002 (1.60) | 0.001*** (2.60) | 0.001*** (2.60) | 0.001 (1.08) | 0.001 (1.08) |
| SIGMA | -5.255*** (-6.14) | -5.239*** (-6.12) | -5.255*** (-6.14) | -5.071*** (-6.42) | -3.083*** (-10.97) | -3.079*** (-10.96) | -2.692*** (-3.98) | -2.691*** (-3.98) |
| CUM_RET | 0.152*** (7.77) | 0.152*** (7.78) | 0.152*** (7.77) | 0.106*** (5.44) | 0.046*** (8.06) | 0.046*** (8.07) | 0.036*** (5.29) | 0.036*** (5.28) |
| BM | -0.034*** (-3.34) | -0.034*** (-3.33) | -0.034*** (-3.34) | -0.034*** (-3.53) | -0.007** (-2.42) | -0.007** (-2.42) | -0.009 (-1.48) | -0.009 (-1.47) |
| LEV | -0.089 (-1.38) | -0.090 (-1.39) | -0.089 (-1.38) | -0.064 (-1.06) | -0.005 (-0.28) | -0.006 (-0.30) | -0.011 (-0.32) | -0.010 (-0.31) |
| ROE | 0.038** (2.34) | 0.038** (2.34) | 0.038** (2.34) | 0.002 (0.12) | 0.024*** (5.03) | 0.024*** (5.03) | -0.002 (-0.09) | -0.002 (-0.09) |
| LNSIZE | 0.062*** (9.46) | 0.062*** (9.47) | 0.062*** (9.46) | 0.064*** (10.37) | -0.002 (-1.10) | -0.002 (-1.12) | 0.001 (0.16) | 0.001 (0.18) |
| DTURNOVER | 0.040*** (3.90) | 0.040*** (3.87) | 0.040*** (3.90) | 0.021** (2.02) | 0.005 (1.54) | 0.005 (1.51) | 0.001 (0.16) | 0.001 (0.16) |
| ACCRUALS | 0.315* (1.71) | 0.318* (1.72) | 0.315* (1.71) | 0.036 (0.26) | 0.052 (0.94) | 0.052 (0.95) | -0.199 (-1.14) | -0.199 (-1.14) |
| MISS | -0.089*** (-3.05) | -0.085*** (-3.67) | -0.089*** (-3.05) | -0.058** (-2.57) | -0.019** (-2.30) | -0.019*** (-2.93) | -0.015 (-1.32) | -0.011 (-1.10) |
| INDUSTRY FE | YES | YES | YES | YES | YES | YES | YES | YES |
| YEAR FE | YES | YES | YES | YES | YES | YES | YES | YES |
| OBSERVATIONS | 20,164 | 20,164 | 20,020 | 20,020 | 20,160 | 20,160 | 20,017 | 20,017 |
| ADJUSTED R² | 0.034 | 0.034 | 0.036 | 0.036 | 0.032 | 0.032 | 0.007 | 0.007 |

Notes: This table repeats the analysis in Table 5 after splitting the sample into two subsamples by the working capital accruals of Frankel and Sun (2018). This table focuses on the information content of Topic 7 (risks in product prices and materials costs). The High Accruals (top-half) subsample consists of firm-year observations with accruals that are above the corresponding industry medians. The Low Accruals (bottom-half) subsample includes the rest. All regressions include Fama-French 48 industry dummies and year fixed effects. MISS is an indicator variable that takes the value of one for firm-years without topic assignments or topic proportions due to very short market risk disclosures in Item 7A (100 words or fewer). t-statistics, based on robust standard errors clustered at the firm level, are reported in parentheses. Continuous variables are winsorized at the 1% and 99% levels. ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively.
