
Version of 3/19/2018

Abiding by the Law: Using Benford’s Law to Examine Nonprofit Financial

Reports

Heng Qu*

The Bush School of Government and Public Service

Texas A&M University

Richard Steinberg**

Department of Economics and Lilly Family School of Philanthropy

IUPUI

Ronelle Burger***

Department of Economics

Stellenbosch University

Abstract: Tax-exempt organizations generally must file annual informational re-

turns (Form 990) with the Internal Revenue Service (IRS). Funders, charity

watchdogs, and regulators use this information as their primary source for finan-

cial accountability, and scholars rely on the data to evaluate and understand non-

profit behavior. The IRS is charged with protecting the integrity of 990 data but

lacks resources and rarely audits these returns. We show that a mathematical odd-

ity known as Benford’s Law can flag data as suspicious and prioritize candidates

for more detailed investigation. The results confirm that those organizations pre-

dicted to be more likely to file misleading or fabricated data are more likely to be

flagged by Benford’s Law. Further methodological refinement will enable Ben-

ford’s Law to become an effective screening tool for nonprofit financial reports.

* Corresponding Author. The Bush School of Government and Public Service, Texas A&M Uni-

versity, 4220 TAMU, College Station, TX 77843-4220. USA. [email protected]

** Dept. of Economics, IUPUI, Indianapolis, IN 46236. [email protected]

*** Dept. of Economics, Stellenbosch University. Private Bag X1, Matieland, 7602, Stellen-

bosch, South Africa. [email protected]

Abiding by the Law: Using Benford’s Law to Examine Nonprofit Financial

Reports

Abstract: Tax-exempt organizations generally must file annual informational returns (Form 990)

with the Internal Revenue Service (IRS). Funders, charity watchdogs, and regulators use this in-

formation as their primary source for financial accountability, and scholars rely on the data to

evaluate and understand nonprofit behavior. The IRS is charged with protecting the integrity of

990 data but lacks resources and rarely audits these returns. We show that a mathematical oddity

known as Benford’s Law can flag data as suspicious and prioritize candidates for more detailed

investigation. The results confirm that those organizations predicted to be more likely to file mis-

leading or fabricated data are more likely to be flagged by Benford’s Law. Further methodologi-

cal refinement will enable Benford’s Law to become an effective regulatory tool for nonprofit

financial reports.

Practitioner Points:

• Benford’s Law is useful for flagging suspicious financial data, which can be used as a

mechanism to enhance the reliability of nonprofit financial data, thus promoting account-

ability and transparency in the nonprofit sector.

• We suggest a neglected statistical test and argue that it is less likely to produce false neg-

atives than conventional tests used in Benford analysis.

• Besides nonprofit financial reports, Benford’s Law can also be applied to many other

kinds of government data.

Nonprofit organizations offer an important alternative institutional form for delivering

welfare, health, education, arts and culture, international, religious, and other social services.

Several agencies are charged with oversight of U.S. nonprofit organizations, including the Inter-

nal Revenue Service (IRS), the National Association of State Charity Officials (NASCO), and

more informally a number of peer review and monitoring bodies. Parallel to debates on transpar-

ency in public spheres, there are concerns that the effectiveness of these oversight channels is

impeded by weaknesses in the Form 990 data.

Most U.S. nonprofits with gross receipts exceeding $25,000 must file an informational

return each year (Form 990 or 990-EZ) with the IRS and make this form publicly available.1 This

information disclosure is an important mechanism for nonprofit accountability. The IRS (2017,

1) explains the purpose in its instructions: “Some members of the public rely on Form 990 or

Form 990-EZ as their primary or sole source of information about a particular organization. How

the public perceives an organization in such cases can be determined by information presented

on its return.” Form 990 contains extensive financial information as well as categorical infor-

mation on activities and governance. Major funders require that their applicants submit copies of

their Forms 990, charitable ratings services (e.g., Charity Navigator, the Better Business Bu-

reau’s Wise Giving Alliance, and Charity Watch) use 990 data to evaluate nonprofit financial ac-

countability, and Guidestar posts the Forms 990 of all registered nonprofits to advance transpar-

ency and inform donor decisions. Finally, the comprehensive panels of 990 data are a goldmine

for scholars who advance knowledge on nonprofit behavior.

Therefore, the reliability of 990 data is important. Yet, there are many problems. Prior re-

search finds that some Form 990 reports may violate generally accepted accounting principles

(GAAP), containing errors and reporting biases (e.g. Keating and Frumkin 2003). Keating and

Frumkin (2003) conclude that weaknesses in Form 990 data are due to structural factors such as

the limited market for nonprofit information (e.g. no stock market of investors) and heterogene-

ity in the needs of data users. They also argue that while greater transparency and credibility may

be socially optimal – improving the reputation of the sector and consequently increasing invest-

ments and donations in the sector – it is never beneficial for an individual nonprofit to disclose

more in a competitive funding environment.

In particular, Form 990 requires nonprofits to divide their total expenditures into three

functional categories: program, fundraising, and administrative. Many studies reveal the misre-

porting of functional expenses. Donors, regulators, charity watchdogs, the media, and the public

have paid the most attention to the program ratio (i.e. the ratio of program to total expenses) as an

important metric for nonprofit performance (see Garven, Hofmann, and McSwain 2016 for a re-

view). Therefore, nonprofits have the incentive to improve their program ratio by reallocating

(thereby underreporting) some fundraising and/or administrative expenses to program expenses

(thereby overreporting). Studies confirm that the patterns of functional expense reporting are

consistent with stakeholder-impression management, as the reported program ratios are exces-

sive while the reported ratios of fundraising (or administrative) to total expenditures are too low

(Trussel 2003; Wing et al. 2006; Jones and Roberts 2006; Krishnan, Yetman, and Yetman 2006;

Keating, Parsons, and Roberts 2008; Krishnan and Yetman 2011). Other studies find that non-

profits may also manipulate data to reduce their unrelated-business income tax obligations (Omer

and Yetman 2003; 2007). Some support their conclusions by comparing 990 data with audited

financial statements (Froelich, Knoepfle, and Pollak 2000; Keating and Frumkin 2003; Krishnan

et al 2006; Burks 2015). Others compare annual reports to state regulatory agencies with 990

data (Keating et al 2008; Krishnan and Yetman 2011). None use the approach taken in this arti-

cle, comparing the distribution of reported numbers with that predicted by mathematical theory.

The Internal Revenue Service (IRS) is charged with protecting the integrity of 990 data,

but lacks resources and rarely audits Forms 990 (Mayer 2016). In this article, we verify the ap-

plicability of Benford’s Law to flag suspicious data and prioritize candidates for a more detailed

investigation. Benford’s Law states that the leading digits of most kinds of naturally generated

numerical data are not equally likely to occur – it is far more likely that the first digit will be a 1

(30% of the time) than a 9 (less than 5% of the time). The leading digits of made-up numbers

have different distributions, so we test for data integrity by seeing whether the frequencies of

leading 1s, 2s, 3s, etc. in Form 990 data differ from the Benford frequencies.

Some of the deviations from the Benford distribution are more excusable than others, re-

flecting differing reporting periods, accounting methods, or transcription errors. However, the

differences that reflect conscious manipulation of the data, designed to manage stakeholder per-

ceptions or to elude taxes and regulations, are of far greater concern. Benford analysis can signal

that such abuses are present.

We analyze compliance with Benford’s law at multiple levels – across the whole sample,

by selected variables, and by individual organizations, using a panel of 990 data from 1998 to

2003. We develop hypotheses around three broad institutional characteristics that prior literature

suggests would make nonprofits more likely to deviate from Benford’s law, namely: 1) philan-

thropic amateurism, 2) compliance with public perception, and 3) lack of funder oversight. Our

statistical tests and regression analysis provide support for all three sets of hypotheses.

In the next section, we provide background on Benford’s Law. Then we develop our hy-

potheses, describe the data and method. The following section presents results, and we conclude

with a discussion of implications for government practice.

Benford’s Law

Paging through library copies of logarithm tables, the American astronomer Simon New-

comb observed that the pages at front dealing with the low digits were far more worn out than

those in back. Based on this, he formulated a hypothesis that the ten digits do not occur with

equal frequency, with more numbers beginning with the lowest digits (1 or 2) than with the high-

est digits (8 or 9). Specifically, “the law of probability of the occurrence of numbers is such that

all mantissae of their logarithms are equally likely” (Newcomb, 1881). Given this, “natural”

numbers should begin with the digit 1 about 30% of the time and with the digit 2 about 18% of

the time; continuing down, numbers should begin with the digit 8 about 5.1% of the time and

with the digit 9 about 4.6% of the time. This phenomenon was later referred to as Benford’s Law.
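
Stated as a formula (a standard form of the law, added here for reference), the expected frequency of a leading digit d is P(d) = log10(1 + 1/d) for d = 1, 2, …, 9, which yields P(1) ≈ 30.1%, P(2) ≈ 17.6%, and so on down to P(8) ≈ 5.1% and P(9) ≈ 4.6%.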

Newcomb did not provide a theoretical explanation, and his discovery attracted little

attention until 1938, when Frank Benford, a physicist with the General Electric Company, noted

the same pattern. He confirmed the hypothesis by examining 20 different types of “natural”

numbers (20,229 observations), ranging from square roots of integers,

atomic weights, population sizes, surface areas of rivers, and death rates, to street addresses of

American Men of Science, and every number from an issue of Readers Digest (Benford, 1938).

For a long time, Benford’s Law was deemed a mysterious law of nature. The mathematician

Theodore Hill (1995) gave the first rigorous theoretical explanations of the law, followed by later

scholars who continued this work. Although the mathematical proof of Benford’s Law

is complicated, the natural relationship between growth rates and the log of first digits provides

some intuition. Suppose the total revenue of an organization is $1,000,000. Then it must grow

100% before it reaches $2,000,000 (leading digit of 2), 50% for the leading digit to be 3, and

only 12.5% to raise a leading digit 8 to 9. This makes 1 more common than any other digit

(Durtschi, Hillison, and Pacini 2004).
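
A small simulation illustrates this intuition. The sketch below is ours, not the authors’: the starting values, growth rates, and variable names are arbitrary assumptions, but repeated multiplicative growth of any positive quantities drives the leading digits toward the Benford frequencies.

```python
import math
import random
from collections import Counter

def leading_digit(x):
    """Return the first significant digit of a positive number."""
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

random.seed(42)
# Hypothetical starting revenues; any positive values would do.
revenues = [random.uniform(1e4, 1e6) for _ in range(5000)]

# Grow each figure by a random rate for many periods; repeated
# multiplication is what pushes the leading digits toward Benford.
for _ in range(50):
    revenues = [r * random.uniform(1.01, 1.20) for r in revenues]

counts = Counter(leading_digit(r) for r in revenues)
for d in range(1, 10):
    print(f"digit {d}: observed {counts[d] / len(revenues):.3f}, "
          f"Benford {math.log10(1 + 1 / d):.3f}")
```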

Practical applications of Benford’s Law did not arrive until the 1970s. Hal Varian (1972)

first suggested that Benford’s Law could be used to verify the plausibility of economic forecast

data, finding that both the input data and forecasts were consistent with the Benford distribution.

Extending his insight, later researchers compared the first-digit frequency distribution of a da-

taset with the Benford’s Law distribution to test whether the data is naturally generated or made

up. Later, accounting scholars began applying Benford’s Law to detect possibly fraudulent finan-

cial data (e.g. Nigrini 1996), and digital analysis based on Benford’s Law has become a standard

tool in forensic accounting to identify suspicious cases for further investigation (Archambault

and Archambault 2011; Collins 2017). In addition to accounting data, Benford’s Law was instru-

mental in detecting that Greece had falsified the macroeconomic data submitted for entry into the

EU (Rauch et al. 2011). Similarly, Benford’s Law provided evidence of fraud in Russian election

results (Myagkov, Ordeshook, and Shakin 2005). It is now widely viewed as a credible approach

to help detect fieldworker cheating in household surveys (Finn and Ranchhod 2017; Schräpler

2011), and to flag suspicious self-reported information on compliance with regulations (De

Marchi and Hamilton 2006). Benford’s Law produces false positives and false negatives2 and

therefore deviations from Benford’s Law should be used diagnostically, to direct or prioritize re-

sources for further investigations, rather than probatively (Tam Cho and Gaines 2007).

Hill (1995; 1998) noted that natural data will follow Benford’s law when the numbers are

combinations of naturally-generated data in several steps.3 Most accounting data is a combina-

tion of transactions, so Benford analysis is appropriate (Durtschi et al. 2004). Certain types of

data do not conform to Benford’s Law (Durtschi et al. 2004; Schräpler 2011). First, when there is

a binding maximum or minimum value for the data, it will have a non-Benford distribution. Sec-

ond, assigned numbers, such as zip codes or check numbers, are expected to follow a uniform

distribution rather than a Benford distribution. Likewise, numbers influenced by human thought, such as

ATM withdrawal amounts or price setting at $9.99 to exploit psychological thresholds, do not

conform to the Benford distribution. Third, data consisting of single transactions does not obey

the Law, which relies on the combination of transactions.

We have located only one article (Van Caneghem 2016: Belgian nonprofits), two work-

ing papers (Dang and Owens 2016: UK charities; Burger, Dang, and Owens 2017: Ugandan

NGOs), and two blog posts (Galbi 2012; Kane 2011) that apply Benford analysis to nonprofit fi-

nancial data. Van Caneghem (2016) applied a variant of Benford’s Law that predicts the frequency

distribution of the first two digits to Belgian nonprofit data. He found that 0 and 5 appeared as the

second digit much more often than Benford predicted, and that all other digits were observed less

frequently than predicted. He attributes this to roundoff error.4 Because roundoff error is a relatively

minor inaccuracy that usually involves no intention to mislead, we conduct our Benford analysis

on the leading digit, which cannot be rounded off. In addition, we are among

the first to apply Benford analysis to the U.S. nonprofit tax return data.

Theory and Hypotheses

The reliability of nonprofit financial reports is important, yet prior research provides evi-

dence of misreporting. Some suggest that nonprofit financial misreporting could be uninten-

tional, due to constraints in organizational resources, such as accounting expertise, management

experience, and governance mechanisms (Yetman and Yetman 2013; Keating et al 2008). Oth-

ers imply that misreporting could be a result of intentional managerial manipulation, that is, non-

profit managers have an incentive to understate their administrative and/or fundraising expenses

for improved efficiency ratios (Krishnan et al 2006; Jones and Roberts 2006). We propose eight

hypotheses suggested by prior literature, falling into three categories – philanthropic amateurism,

public perception compliance, and funder oversight.

Philanthropic amateurism versus professionalism. Salamon (1987) proposed that volun-

tary failure should be weighed against market and government failures to decide on the role of

nonprofit organizations in a three-sector economy. He noted that government employees must be

credentialed to ensure the legitimacy of political institutions, and there are strong forces in favor

of professionalism in the for-profit sector as well. In contrast, voluntary actors, rooted in dedica-

tion to mission, may not acquire the same sorts of professional training. For example, performing

arts organizations are often led by former performers rather than those trained in arts manage-

ment and business methods.

This “philanthropic amateurism” is defined here as the lack of accounting sophistication

or management professionalization. First, there are variations in the reported accounting method

on Form 990: cash, accrual, or other. Although not required by the IRS, accrual accounting is

recommended in the Generally Accepted Accounting Principles (GAAP) because it provides a

more accurate picture of a firm’s overall financial health (Keating and Frumkin 2003). Keating et

al. (2008) suggest that the use of GAAP indicates organizations’ accounting sophistication, and

they find that those with less accounting sophistication are more likely to misreport the costs of

telemarketing campaigns on Form 990. Following previous literature, our first hypothesis is:

H1: Organizations employing accrual accounting are more compliant with the Benford

distribution (i.e. proper reporting) than organizations using cash or other accounting methods.

Second, organizations using external professional accountants are more likely to

properly report their financials on Form 990, compared with those that report zero accounting

fees and thus presumably use no external professional accountants (Keating et al. 2008; Krishnan et al. 2006). Our

second hypothesis is:

H2: Nonprofits using external professional accounting services (i.e. reporting positive ac-

counting fees) are more compliant with the Benford distribution than other organizations.

The third form of philanthropic amateurism is the absence of professional management,

indicated by zero compensation to officers and/or directors. Prior research suggests that the exist-

ence of professional management may reflect a higher ability to produce high-quality financial

reports (Tinkelman 1999). On the other hand, principal-agent theory, which describes difficulties

when a principal works with others (agents) who do not necessarily share the principal’s interest,

provides a counterargument. Here, nonprofit managers (as agents of the nonprofit board) may

misreport some expenses in order to increase their compensation (Krishnan et al. 2006; Baber,

Daniel, and Roberts 2002). Because we do not know which effect is bigger, we test the two-sided

null hypothesis that:

H3: Nonprofits that employ paid officers and/or directors and those relying on volunteer

officers or directors are equally compliant with the Benford distribution.

Public perception compliance. Philanthropic amateurism is one reason for misreporting,

managerial manipulation is another. Rightly or wrongly, donors, charity watchdogs, regulators,

and media all pressure nonprofits to have a low overhead ratio (i.e. fundraising and administra-

tion expenses divided by total expenses) and consequentially, a high program ratio (i.e. program

expenses divided by total expenses). To improve its program ratio and thus lower the overhead

ratio, a nonprofit may misallocate part or all its fundraising and administrative expenses as pro-

gram expenses, or net-out fundraising expenses from funds raised instead of separately reporting

fundraising expenses (Garven et al. 2016). More severely, many nonprofits reported zero fund-

raising expenses (despite receiving substantial contributions) and/or zero administrative ex-

penses. For example, the Urban Institute and the Center on Philanthropy at Indiana University (2004)

analyzed 220,000 Forms 990 (1999-2004) and reported that 37% of nonprofit organizations with

at least $50,000 in contributions reported no fundraising costs, as did 25% of those receiving be-

tween $1 million and $5 million in contributions. In addition, 13% of nonprofits reported zero

administrative costs. Similarly, in a sample of 73,107 Forms 990 reporting at least $10,000 in to-

tal expenses (1998-2006), Yetman and Yetman (2013) found 36% of organizations receiving at

least $10,000 in donations reported no fundraising expenses and 3% reported no administrative

expenses. Following prior research, we hypothesize that:

H4: Organizations receiving at least $10,000 in donations while reporting zero fundrais-

ing expenses are less compliant with Benford’s Law than other organizations.

H5: Organizations reporting zero administrative expenses are less compliant with Ben-

ford’s Law than those reporting positive administrative expenses.

Funder oversight and monitoring. There are many principal-agent relationships among

nonprofits and their stakeholders, with the nonprofit sometimes acting as the principal and some-

times as the agent (Steinberg 2010). The manipulation of nonprofit financial reports by managers

as agents of the board as principal was discussed in H3. Here we have a principal-agent problem

between the organization as the agent and outside stakeholders as principals. Donors and funders

use nonprofit organizations to provide services and personal benefits (such as enhanced reputa-

tion and warm glow) and seek accurate information about the use of their contributions. The non-

profits pursue their mission and know that in some cases that mission can be advanced by manip-

ulating financial data to increase grants, contracts, and donations. Monitoring and oversight by

funders reduces this mismatch (Hansmann 1996).

Government agencies, as principals, use various monitoring strategies and oversight re-

quirements to ensure that their agents provide accurate financial feedback regarding the use of

their grant and contract receipts, using financial audits, quarterly fiscal reports, and other tech-

niques (Van Slyke 2007). We therefore hypothesize:

H6: Organizations receiving government grants are more compliant with Benford’s Law

than those that never received government grants.

Nonprofits often receive “indirect public support” through federated fundraising agencies

(e.g. United Way). Federated campaigns often impose audit and other financial accountability

requirements on participating organizations to ensure the reputability of the combined campaign.

For example, the umbrella organization of United Way requires that local United Ways assure

that all local member organizations “undergo annual financial audits conducted by independent

certified public accountants whose examination complies with generally accepted auditing stand-

ards. In addition, United Ways have developed comprehensive requirements for completion of

audited financial statements to ensure consistency and transparency system-wide” (United Way

n.d.). We hypothesize:

H7: Organizations receiving indirect public support are more compliant with Benford’s

Law than those that never received indirect public support.

Finally, some organizations have temporarily or permanently restricted assets. These time

or purpose restrictions are imposed by major donors, who have the motivation and power to exert

more oversight. Prior research provides evidence that organizations with donor-restricted assets

are associated with less misreporting (e.g. Keating et al. 2008; Yetman and Yetman 2013). We

therefore propose:

H8: Organizations receiving temporarily or permanently restricted assets are more com-

pliant with the Benford distribution than other organizations.

Data

We use the National Center on Charitable Statistics (NCCS)-GuideStar National Non-

profit Research Database (“digitized data”) for our analyses. The database includes tax-exempt

nonprofit organizations that are required to file IRS Form 990 or 990-EZ between 1998 and

2003. Although there are other Form 990 datasets with a longer time span and more recent data,

the digitized data best suits the needs of this study by providing the most detailed financial infor-

mation across all datasets. Because we are testing for dirty data, we did not clean the data except

to exclude observations with unknown National Taxonomy of Exempt Entities (NTEE) codes

and include only those organizations filing in all six years in order to obtain a balanced panel of

organizations. We then draw a random sample of 1,000 organizations for our analyses,5 includ-

ing 6,000 organization-year observations and up to 214 financial variables for each organization.

Methods

We conduct digital analysis on the sample, that is, we compare the observed first-digit fre-

quencies of the sample Form 990 data with the expected first-digit frequencies according to Ben-

ford’s Law. Benford’s Law is violated (and thus plausible misreporting occurs) when the

empirical distribution of first digits in our data differs from the first-digit distribution based on

Benford’s Law. We therefore test the joint null hypothesis

H0: pE(d) = log10(1 + 1/d) for all d = 1, 2, …, 9

against the two-sided alternative, where pE(d) is the empirical probability of the first digit equal-

ing d. Following convention, we exclude observations from our analysis whenever they take neg-

ative values, as the distribution of negative values is likely different from the distribution of posi-

tive values. First-digit analysis also excludes zeros, decimals, and missing values.
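
As a concrete illustration of these exclusions, the following sketch (our own code, not the authors’ implementation; the reading of “decimals” as non-integer amounts is our assumption) extracts usable leading digits from a list of reported amounts.

```python
def first_digit(value):
    """Leading digit of a positive number, e.g. 4823 -> 4."""
    return int(str(int(value))[0])

def usable_first_digits(amounts):
    """Apply the exclusions described in the text: drop missing, negative,
    and zero values, and (our reading of 'decimals') non-integer amounts."""
    kept = []
    for a in amounts:
        if a is None:        # missing value
            continue
        if a <= 0:           # negatives and zeros excluded
            continue
        if a != int(a):      # non-integer ('decimal') amounts excluded
            continue
        kept.append(first_digit(a))
    return kept

# Hypothetical reported amounts: only 125000 and 980 survive the exclusions.
print(usable_first_digits([125000, -300, 0, 41.5, None, 980]))  # -> [1, 9]
```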

If the null hypothesis that the frequencies of the observed leading digits conform to the

Benford distribution is rejected at the 5% level, then the organization warrants further investi-

gation. We test the null hypothesis using three alternative statistics: the Pearson chi-squared (χ2),

the modified Kolmogorov-Smirnov (KS) (Joenssen 2015), and the Freedman-Watson U2 (U2)

tests. Each test has advantages and shortcomings described in more detail in the appendix. The χ2

test is traditional in this literature, reported for comparability. The modified KS test is also con-

ventional in this literature because it is more powerful, that is, better able to detect true depar-

tures from the null hypothesis. However, neither of these tests is fully appropriate for Benford analy-

sis because they assume that we are testing whether the observed distributions are far from the

expected distributions along a line of numbers. This assumption incorrectly implies that the lead-

ing digits 9 and 1 are furthest from each other. Instead, we should be testing for distance around

a circle, because the leading digits move in order from 1 to 9 and then back to 1 again. The U2

test is specifically designed for circular data, and has additional desirable statistical properties

explained in the appendix. We believe we are among the first to apply this test to Benford analy-

sis. The U2 test is found to be more powerful than χ2 and KS tests in most cases (Lesperance et al.

2016), so we prefer the U2 test whenever the results of the three tests differ.
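
The Freedman-Watson U2 test is not part of standard Python libraries, but the conventional chi-squared comparison described above can be sketched as follows (illustrative only; scipy is assumed to be available and the digit counts are hypothetical).

```python
import math
from scipy.stats import chisquare

def benford_expected(n):
    """Expected counts of leading digits 1-9 under Benford's Law."""
    return [n * math.log10(1 + 1 / d) for d in range(1, 10)]

# Hypothetical counts of leading digits 1..9 pooled from one filer's reports.
observed = [118, 72, 49, 38, 30, 26, 23, 21, 18]

stat, p_value = chisquare(f_obs=observed, f_exp=benford_expected(sum(observed)))
print(f"chi-squared = {stat:.2f}, p = {p_value:.4f}")
# A p-value below 0.05 would reject conformance and flag the filer for review.
```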

We also report a descriptive measure, the mean absolute deviation (MAD), to indicate the

magnitude of nonconformance. This measure calculates the sum of the absolute differences be-

tween the observed and expected frequencies of each leading digit, divided by nine. According to

Nigrini (2012), a MAD statistic ranging from 0 to 0.006 indicates close conformance to Ben-

ford’s Law, 0.006 to 0.012 indicates acceptable conformance, 0.012 to 0.015 indicates margin-

ally acceptable conformance, and nonconformance if the MAD statistic is above 0.015.
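
A sketch of the MAD computation as just described (our own code; the digit counts are hypothetical):

```python
import math

def mad_statistic(digit_counts):
    """Mean absolute deviation between observed and Benford first-digit
    proportions: sum of absolute differences over the nine digits, divided by nine."""
    n = sum(digit_counts)
    total = 0.0
    for d in range(1, 10):
        observed = digit_counts[d - 1] / n
        expected = math.log10(1 + 1 / d)
        total += abs(observed - expected)
    return total / 9

# Hypothetical counts of leading digits 1..9 pooled over one organization's filings.
counts = [40, 22, 14, 11, 8, 7, 6, 5, 4]
print(f"MAD = {mad_statistic(counts):.4f}")  # compare with Nigrini's (2012) cutoffs
```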

We conduct the digital analysis at three levels. First, we test whether the entire sample

complies with the Benford distribution. Then, we test data at the variable level to see whether

selected variables violate Benford’s Law. Finally, we test the compliance of individual organiza-

tions.

In addition to the digital analysis, we test our hypotheses at the organizational level.

Specifically, we estimate the effects of philanthropic amateurism, public perception compliance,

and funder oversight on the magnitude of nonconformance at the organizational level using OLS

regressions with robust standard errors. The regressions control for organizational size, age, and

sector, so the empirical model takes the form:

Magnitude of Nonconformance_i = β0 + β1 Accrual Accounting_i + β2 Accounting Fees_i +

β3 Officer Compensation_i + β4 Zero Fundraising Expenses_i + β5 Zero Management Expenses_i +

β6 Government Grants_i + β7 Indirect Public Support_i + β8 Donor Restrictions_i + β9 Size_i + β10 Age_i +

β11 Subsector_i + ε_i

Operationally, the dependent variable is organizational MAD, calculated by pooling all

the reported numbers from all the years for each organization. Three indicator variables measure

philanthropic professionalism/amateurism. First, Accrual Accounting equals 1 if the organization

reported using accrual accounting all six years, 0 otherwise. Second, Accounting Fees equals 1 if

the organization reported positive accounting fees for all years, 0 otherwise. Third, Officer Com-

pensation equals 1 if the organization reported positive compensation of officers and/or directors

for at least one year, 0 otherwise. We predict negative coefficients for the first two measures, and

the coefficient for Officer Compensation could go in either direction.

Two indicator variables are used to measure public perception compliance. The first indi-

cator, Zero Fundraising Expenses, equals 1 if the organization reported zero fundraising ex-

penses while receiving at least $10,000 in total contributions in at least one year, 0 otherwise

(following Krishnan et al. 2006). The second indicator, Zero Management Expenses, equals 1 if

the organization reported zero management expenses for at least one year, 0 otherwise. We pre-

dict positive coefficients for both indicators.

Three indicator variables measure funder monitoring. First, Government Grants equals 1

if the organization received positive government grants for at least one year, 0 otherwise. Sec-

ond, Indirect Public Support, equals 1 if the organization received a positive amount of indirect

public support in at least one year, 0 otherwise. Third, Donor Restrictions equals 1 if the organi-

zation received any temporarily or permanently restricted assets in at least one year, 0 otherwise.

We predict the coefficients of these three variables will be negative.

Finally, we include control variables for size, age, and subsector, following prior litera-

ture. Size is the natural log of an organization’s six-year average total expenses (we used alterna-

tive size measures, six-year average total revenue and six-year average total assets, for robustness

checks). Age is the organization’s age in 2003 (years since granted tax-exempt status), the latest

year in our data. We also control for the organization’s subsector using indicator variables for the

five major subsectors coded in the National Taxonomy of Exempt Entities – Arts, Culture and

Humanities; Education; Health; Human Services; and Other.
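
A sketch of how such a regression could be estimated (illustrative only; the pandas column names are our placeholders rather than the authors’ variable names, and HC1 is one common choice of heteroskedasticity-robust covariance):

```python
import pandas as pd
import statsmodels.api as sm

def run_nonconformance_regression(df: pd.DataFrame):
    """OLS of organizational MAD on the indicators and controls described above,
    with robust standard errors. df holds one row per organization."""
    y = df["mad"]
    X = df[[
        "accrual_accounting", "accounting_fees", "officer_compensation",
        "zero_fundraising_expenses", "zero_management_expenses",
        "government_grants", "indirect_public_support", "donor_restrictions",
        "log_total_expenses", "age",
        # subsector dummies (Arts, Culture and Humanities omitted as reference)
        "education", "health", "human_services", "other_subsector",
    ]]
    X = sm.add_constant(X)
    return sm.OLS(y, X).fit(cov_type="HC1")

# Example: print(run_nonconformance_regression(org_df).summary())
```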

Results

Digital Analysis

Whole Sample Conformance. First, we test the overall conformance of the whole sample

by pooling all reported numbers from all organizations and years (n = 427,329). The frequencies

of leading digits are very close to those predicted by the Benford distribution, with an average

deviation of only 0.07%. While χ2 (p = 0.0010) and KS (p = 0.049) statistics indicate non-con-

formance with Benford’s Law, the preferred U2 test (p = 0.2043) shows otherwise. We cannot

confidently reject the null hypothesis that the frequencies of first digits follow the Benford distri-

bution. This provides support that the law generally holds for the sample of nonprofit organiza-

tions, which is encouraging.

[Figure 1]

Variable Conformance. Certain variables are of particular interest to donors, watchdogs,

or scholars, including total contributions, program service revenue, total revenue, management

expenses, fundraising expenses, and total expenses. Our tests serve to triangulate the results of

other studies finding that some of these variables are misreported.

Noncompliance with Benford’s Law may indicate a problem with the financial reporting

guidance provided to organizations, rather than organizations’ intent to mislead. For example,

the instructions and guidance for allocating expenses between the program, fundraising, and

management expenses can be complicated and ambiguous, prompting some organizations to use

human judgment and approximation, rather than precise calculations. While such made-up num-

bers may often provide a reasonably accurate reflection of the NPO’s expense shares, they may

cause non-conformance with Benford’s Law.

By the preferred U2 test, we cannot reject the null hypothesis that the reported numbers

follow the Benford distribution for Total Contributions (p = 0.3172) and Management Expenses

(p = 0.2194). However, we can reject the null for Total Revenues (p = 0.0116), Fundraising Ex-

penses (p = 0.0438), Total Expenses (p = 0.0424), and perhaps Program Service Revenues (p =

0.0772). The MAD statistic shows that the deviations for these variables are small, ranging from

0.4% to 0.6%.

The other statistical tests do not always agree with the U2 test. For example, for Fundraising

Expenses, we cannot reject the null hypothesis of conformance using the χ2 statistic (p = 0.3628),

but the U2 statistic indicates nonconformance. Similarly, we would reject the null for Total Con-

tributions using the χ2 statistic (p = 0.0256) while both KS and U2 statistics indicate the opposite.

If we are correct that U2 is a more appropriate test, this illustrates the possibility that conven-

tional tests produce misleading conclusions. We therefore suggest that users of digital analysis

include U2 testing in their toolkit and interpret results with caution when conventional tests dif-

fer.

Organizational Conformance. To test each organization’s conformance with Benford’s

Law, we pool all the reported numbers from all years for each organization. We obtained both

the test statistics and MAD for each organization. By the preferred U2 test, less than 20% of the

sample reported financial data that are compliant with Benford’s Law. These organizations have

a smaller MAD on average (n = 182, MAD = 0.015) than those that do not pass the U2 test (n =

818, MAD = 0.034) (Table 2). Results are similar using the other two tests. About 25% of the

sample conforms using the KS test, and less than 15% conform using the χ2 test.6 This is similar

to the findings from the financial reports of UK charities that only 25% of the sample conforms

to Benford’s Law using the KS test (Dang and Owens 2016). Table 2 reports the test p-values

and MAD statistics for the whole sample, selected variables, and organizations.

[Table 2]

Hypothesis Testing

We split the sample according to our hypotheses to see whether there are significant dif-

ferences in the MAD across subsamples. Table 3 reports the summary statistics and Table 4 re-

ports OLS regressions with robust standard errors.
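
The article reports p-values for the subsample MAD differences below without naming the specific two-sample test; a Welch t-test is one plausible implementation, sketched here with hypothetical inputs.

```python
from scipy.stats import ttest_ind

def compare_mad(mad_group_a, mad_group_b):
    """Compare mean organizational MAD across two subsamples.
    Welch's t-test (unequal variances) is our assumption; the article
    does not state which test produced its reported p-values."""
    stat, p_value = ttest_ind(mad_group_a, mad_group_b, equal_var=False)
    return stat, p_value

# Example: compare_mad(mad_accrual_orgs, mad_cash_or_other_orgs)
```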

Philanthropic amateurism. We use three different indicators of philanthropic amateur-

ism/professionalism. First, we compare organizations using the accrual accounting method with

those using other accounting methods, hypothesizing that the former have more accounting so-

phistication and are more likely to produce high-quality financial reports. We find that nonprofits

using cash or other accounting methods in any year (n = 411, MAD = 0.0384) depart from the

Benford’s distribution significantly more than those using accrual accounting every year (n =

589, MAD = 0.0244; p < 0.0001). Second, we compare organizations that reported positive ac-

counting fees with those that did not, based on the hypothesis that organizations hiring external

accountants are more likely to produce high-quality financial reports. This hypothesis is also

confirmed: nonprofits reporting positive accounting fees every year conform with the Benford

distribution (n = 406, MAD = 0.0265) significantly more than those reporting zero accounting

fees in at least one year (n = 594, MAD = 0.0326; p < 0.0001). Third, we report the differences

between organizations that have paid officers and/or directors and those reliant on unpaid volun-

teers, and find that those reporting positive compensation for at least one year (n = 568, MAD =

0.0241) are significantly more conformant than those reporting zero compensation every year (n

= 432, MAD = 0.0381, p < 0.0001).

Public perception compliance. Our two hypotheses related to image manipulation are

also confirmed. Organizations with suspiciously reported fundraising expenditures, those that re-

ported spending nothing and receiving at least $10,000 in donations in any year, are significantly

less conformant with Benford’s distribution (n = 601, MAD = 0.0311) than those that never re-

ported suspicious fundraising expenditures during the period (n = 399, MAD = 0.0287; p <

0.0001). The same is true for organizations reporting no administrative expenditures in any year

(n = 730; MAD = 0.0430) versus those reporting positive administrative expenses (n = 270;

MAD = 0.0254; p < 0.0001).

Funder oversight. We offered three hypotheses related to funder oversight, and all are con-

firmed. Conformance is significantly better for those organizations receiving government grants

for at least one year (n = 433, MAD = 0.0247) than for organizations that received no govern-

ment grants during the sampled years (n = 567, MAD = 0.0343; p < 0.0001). Conformance is

also significantly better for those organizations receiving indirect public support in at least one

year (n = 287, MAD = 0.0238) than for those that never received indirect public support (n =

713, MAD = 0.0327; p < 0.0001). Finally, conformance is also significantly higher for those or-

ganizations that received temporarily or permanently restricted donations in any year (n = 456,

MAD = 0.0228) than for those that never received restricted donations (n = 544, MAD = 0.0363;

p < 0.0001).

[Table 3]

Regression Results. The dependent variable is the organizational MAD statistic. Each of our

hypotheses is tested by the coefficient on the corresponding indicator variable. We also control

for subsector using NTEE codes, age in 2003, and one of the three alternative measures of organ-

izational size. A negative coefficient indicates that the factor improves conformance with the

Benford distribution, and a positive coefficient indicates that conformance is lowered.

First, our hypotheses are confirmed for two out of three tests of philanthropic amateurism

– the use of the accrual accounting method and the presence of paid officers. However, the impact of

positive accounting fees was small and not statistically significant. Second, for our public per-

ception compliance hypotheses, reporting zero management expenses is associated with

lower conformance to Benford’s Law. However, the association between zero fundraising ex-

penses and conformance is small and not significant. Third, the evidence supports our donor

oversight hypotheses in all cases.

Turning to the control variables, we find that larger organizations are more conformant

by two measures (total revenues and total expenses), but the total-assets measure has no significant effect.

Education, Human Service, and Other subsector organizations appear significantly less conform-

ant than Arts and Culture organizations (the excluded category), whereas there are small and in-

significant differences between Health and the excluded subsector. Age had a tiny effect on con-

formance that was significant in two specifications and borderline significant in the third.

[Table 4]

Discussion

Previous research in other sectors concludes that Benford’s Law is reliable and robust

when applied to aggregated numbers that have no natural ceiling or cut-off. In this study, we

demonstrate that digital analysis can be used to test the conformance of nonprofit financial data

with Benford’s Law at different levels: across the whole sample, by selected variables, and by

individual organizations. We recommend that users of Benford analysis include U2 testing in

their toolkit and interpret results with caution if conventional tests differ from the U2 test. By the

preferred U2 test, we cannot confidently reject the null hypothesis that the observed first-digit fre-

quencies follow the Benford distribution for the whole sample, suggesting that the reported num-

bers generally follow Benford’s law. We find that deviations from Benford’s Law are larger and

more frequent for organizations that have lower accounting and management sophistication,

questionable functional expense reporting, and weaker funder oversight. This pattern of devia-

tions follows our hypotheses, bolstering the case that violations of Benford’s Law signal the mis-

reporting of data. Our analysis for selected variables suggests that violations appear to be more

prevalent for functional expense reporting where guidelines are more complicated to interpret

and follow, and this finding provides further support for Burger et al.’s (2017) conclusion that nonconformance

with Benford’s Law is higher when there is an increased reporting burden.

Similar to Dang and Owens (2016), we find that the majority of organizations in our sam-

ple do not comply with Benford’s Law. The IRS cannot audit the majority of nonprofits, so

sharper discernment will be required. Additionally, our hypothesis analysis showed that devia-

tions from Benford’s Law are associated with not only intentional manipulations but also ama-

teurism, so it would be important to examine ways to better distinguish between the two types of

deviations.

Therefore, we suggest that future research provide appropriate modifications to the MAD

threshold for noncompliance. Based on an analysis of 25 data sets, Nigrini (2012) labels organi-

zations as noncompliant whenever organizational MAD exceeds 0.015, but this threshold may

not generalize to nonprofit data. A higher threshold may be appropriate for regulators, attorneys

general, and charitable watchdogs. We also recommend using a broader set of forensic analytics,

such as the last-two-digits test and the number-duplication test (Nigrini, 2012). The diversification of

tools will also ensure that regulators cannot be easily duped by more sophisticated manipulators

who fabricate data that conforms with the Benford distribution.

Conclusion

There is a longstanding belief that public nonprofit organizations, which invest in com-

munities, care for the fragile, and protect the weak, are clearly doing good and therefore should

be exempt from scrutiny. While there has been increased pressure to monitor and report, non-

profits want to ensure that new requirements do not burden them to the detriment of their core pur-

poses.

In general, our results support inclusion of Benford analysis in regulatory toolkits. We

find that the patterns of violation are consistent with factors expected to increase reporting prob-

lems. This suggests that this forensic tool is picking up meaningful patterns in nonprofit financial

reporting. However, we also find that Benford analysis of nonprofit financial information flags a

high number of suspicious cases, which suggests that it may include some false positive results.

In order to help the government use its limited resources more effectively, the ideal screen would

flag a small number of cases that are true positives (violations), minimize the number of false

positives (so that few resources are devoted to investigating the excusable), and minimize the

number of false negatives (so that few wrongdoers escape notice). Thus, Benford analysis requires

further refinement to function as an effective screening tool for egregious misre-

porting. Benford screening can also be enhanced by combining it with other forensic analytical

tools.

Enhanced screening will improve the credibility and reliability of nonprofit financial in-

formation and benefit all users of this information – government contractors, government regula-

tors, and donors. More reliable numbers are expected to enhance the accountability reputation of

the nonprofit sector. Although we focus on nonprofit tax return data, Benford’s Law can also be

applied to many other kinds of government data and is relevant to a broader discussion on the

importance of organizational transparency and reputation in the public arena.

Notes

1) The current filing threshold is gross receipts exceeding $50,000, but the threshold was

$25,000 during the years of our data. Most political organizations, as well as churches and cer-

tain related religious institutions, are not required to file Form 990, and private foundations (ex-

cluded from this study) file annual Forms 990-PF.

2) False positives are discussed later in this article. For false negatives, there is an online debate

over whether fraudster Bernard Madoff filed Benford-compatible monthly returns. Kedrosky

(2009) found that data were compliant, but Falkenstein (2008) challenged his methods. It appears

that the original Kedrosky (2009) blog has been removed from his site (hence this is not cited in

the references) although he continues to cite it in Kedrosky (2009). Because of its reliance on

data to detect non-conforming patterns, Benford’s Law will also fail to detect fraud that involves

the omission of transactions or non-reporting.

3) Specifically, when naturally-created data is combined with other naturally-created data in

many steps involving addition, subtraction, multiplication, and division, the distribution of first

digits approaches the Benford distribution as the number of steps grows, regardless of the distri-

bution of the uncombined data.

4) Further, his analysis shows that deviations were higher for small firms and those reliant on

grants. These findings plausibly relate to his focus on second digits and consequent rounding.

5) We limit the sample size because bootstrap is computationally intensive. Although the panel is

balanced at the organization level, organizations vary in reporting on individual variables each

year, so the number of reported variables used to create organization-specific compliance statis-

tics varies across organizations.

6) By all three statistical tests, 71% of the sample are flagged as suspicious and about 11% are

compliant with Benford’s Law. Identification of suspicious cases varies across statistical tests

for 18% of the sample. For example, 92 cases are flagged as suspicious by the U2 test but not the KS test

(MAD = 0.0216). On the other hand, 19 cases are flagged as suspicious by the KS test but not the U2 test

(MAD = 0.0180).

References

Archambault, Jeffrey J., and Marie E. Archambault. 2011. "Earnings Management among Firms

during the Pre-SEC Era: A Benford's Law Analysis." Accounting Historians Journal 38,

no. 2: 145-170.

Baber, William R., Patricia L. Daniel, and Andrea A. Roberts. 2002. "Compensation to Manag-

ers of Charitable Organizations: An Empirical Study of the Role of Accounting Measures

of Program Activities." The Accounting Review 77, no. 3: 679-693.

Benford, Frank. 1938. "The Law of Anomalous Numbers." Proceedings of the American Philo-

sophical Society: 551-572.

Burger, Ronelle, Canh Thien Dang, and Trudy Owens, 2017. “Better Performing NGOs do Re-

port More Accurately: Evidence from Investigating Ugandan NGO Financial Accounts.”

Nottingham: CREDIT, Centre for Research in Economic Development and International

Trade, working paper No. 17/10. https://www.nottingham.ac.uk/credit/documents/papers/2017/17-10.pdf

Burks, Jeffrey J. 2015. "Accounting Errors in Nonprofit Organizations." Accounting Hori-

zons 29, no. 2: 341-361.

Collins, J. Carlton. 2017. "Using Excel and Benford's Law to Detect Fraud: Learn the Formulas,

Functions, and Techniques That Enable Efficient Benford Analysis of Data Sets." Jour-

nal of Accountancy 223, no. 4: 44.

Conover, William J. 1972. "A Kolmogorov Goodness-of-Fit Test for Discontinuous Distribu-

tions." Journal of the American Statistical Association 67, no. 339: 591-596.

Dang, Canh Thien and Trudy Owens, 2016. “How Accurate Are Financial Reports Of British

Charities?” Nottingham: CREDIT, Centre for Research in Economic Development and

International Trade, working paper No. 16/05. https://www.nottingham.ac.uk/credit/documents/papers/2016/16-05.pdf

De Marchi, Scott, and James T. Hamilton. 2006. "Assessing the Accuracy of Self-Reported Data:

An Evaluation of the Toxics Release Inventory." Journal of Risk and Uncertainty 32 (1):

57-76.

Durtschi, Cindy, William Hillison, and Carl Pacini. 2004. "The Effective Use of Benford's Law

to Assist in Detecting Fraud in Accounting Data." Journal of Forensic Accounting 5 (1):

17.

Falkenstein, Eric. 2008. "Benford's Law Catches Madoff (error!)." http://falkenblog.blogspot.com/2008/12/benfords-law-catches-madoff.html.

Finn, Arden and Vimal Ranchhod. 2017. "Genuine Fakes: The Prevalence and Implications of

Data Fabrication in a Large South African Survey." World Bank Economic Review 31

(1): 129-157.

Froelich, Karen A., Terry W. Knoepfle, and Thomas H. Pollak. 2000. "Financial Measures in

Nonprofit Organization Research: Comparing IRS 990 Return and Audited Financial State-

ment Data." Nonprofit and Voluntary Sector Quarterly 29 (2): 232-254.

Galbi, Douglas. 2012. "Non-profits' Distribution of Management Expenses." https://www.purplemotes.net/2012/03/25/non-profits-distribution-of-management-expenses/.

Garven, Sarah A., Mary Ann Hofmann, and Dwayne N. McSwain. 2016. “Playing the Numbers

Game.” Nonprofit Management and Leadership 26, no. 4: 401-416.

Hansmann, Henry. 1996. "The changing roles of Public, Private, and Nonprofit Enterprise in Ed-

ucation, Health Care, and Other Human Services." In Individual and Social Responsibil-

ity: Child Care, Education, Medical Care, and Long-term Care in America, edited by

Victor R. Fuchs, 245-276. University of Chicago Press.

Hill, Theodore P. 1995. “A Statistical Derivation of the Significant-Digit Law.” Statistical Sci-

ence 10(4): 354-363.

Hill, Theodore P. 1998. "The First Digit Phenomenon." American Scientist 86(4): 358-363.

IRS. 2017. “Instructions for Form 990 Return of Organization Exempt from Income Tax”. Ac-

cessed July 3, 2017. https://www.irs.gov/pub/irs-pdf/i990.pdf.

Joenssen, Dieter William. 2015. “BenfordTests: Statistical Tests for Evaluating Conformity to

Benford’s Law." https://rdrr.io/cran/BenfordTests/

Jones, Christopher L., and Andrea Alston Roberts. 2006. "Management of Financial Information

in Charitable Organizations: The Case of Joint-Cost Allocations." The Accounting Re-

view 81 (1): 159-178.

Kane, David. 2011. "Benford's Law and Charity Data." NCVO (blog).

https://blogs.ncvo.org.uk/2011/09/21/benfords-law-and-charity-data/

Keating, Elizabeth K., and Peter Frumkin. 2003. "Reengineering Nonprofit Financial Accounta-

bility: Toward a More Reliable Foundation for Regulation." Public Administration Re-

view 63 (1): 3-15.

Keating, Elizabeth K., Linda M. Parsons, and Andrea Alston Roberts. 2008. "Misreporting Fun-

draising: How do Nonprofit Organizations Account for Telemarketing Campaigns?" The

Accounting Review 83 (2): 417-446.

Kedrosky, Paul. 2009. "Madoff's Results Really were Random." Seeking Alpha (blog), at

https://seekingalpha.com/article/173294-madoffs-results-really-were-random.

Krishnan, Ranjani, and Michelle H. Yetman. 2011. "Institutional Drivers of Reporting Decisions

in Nonprofit Hospitals." Journal of Accounting Research 49 (4): 1001-1039.

Krishnan, Ranjani, Michelle H. Yetman, and Robert J. Yetman. 2006. "Expense Misreporting in

Nonprofit Organizations." The Accounting Review 81 (2): 399-420.

Lesperance, M., W. J. Reed, M. A. Stephens, C. Tsao, and B. Wilton. 2016. "Assessing Confor-

mance with Benford's Law: Goodness-of-Fit Tests and Simultaneous Confidence Inter-

vals." PloS One 11 (3): e0151235.

Mayer, Lloyd Hitoshi. 2016. "The Rising of the States in Nonprofit Oversight." Nonprofit Quar-

terly. https://nonprofitquarterly.org/2016/08/11/rising-states-nonprofit-oversight/.

Myagkov, Mikhail, Peter C. Ordeshook, and Dimitry Shakin. 2005. "Fraud Or Fairytales: Russia

and Ukraine's Electoral Experience." Post-Soviet Affairs 21 (2): 91-131.

Newcomb, Simon. 1881. "Note on the Frequency of Use of the Different Digits in Natural Num-

bers." American Journal of Mathematics 4 (1): 39-40.

Nigrini, Mark J. 1996. “Taxpayer Compliance Application of Benford’s Law.” Journal of the

American Taxation Association. 18(1):72-92.

Nigrini, Mark J. 2012. Benford's Law: Applications for Forensic Accounting, Auditing and

Fraud Detection. Vol. 586. New Jersey: John Wiley & Sons.

Omer, Thomas C. and Robert J. Yetman. 2003. "Near Zero Taxable Income Reporting by Non-

profit Organizations." The Journal of the American Taxation Association 25 (2): 19-34.

Omer, Thomas C. and Robert J. Yetman. 2007. "Tax Misreporting and Avoidance by Nonprofit

Organizations." Journal of the American Taxation Association 29 (1): 61-86.

Rauch, Bernhard, Max Göttsche, Gernot Brähler, and Stefan Engel. 2011. "Fact and Fiction in

EU‐Governmental Economic Data." German Economic Review 12 (3): 243-255.

Salamon, L. M. 1987. "Partners in Public Service: The Scope and Theory of Government-Nonprofit Relations."

In The Nonprofit Sector: A Research Handbook, edited by Walter Powell, 99-117. New

Haven: Yale University Press.

Schräpler, Jörg-Peter. 2011. "Benford's Law as an Instrument for Fraud Detection in Surveys

using the Data of the Socio-Economic Panel (SOEP)." Jahrbücher Für Nationalökonomie

Und Statistik / Journal of Economics and Statistics 231 (5/6): 685-718.

Steinberg, Richard. 2010. “Principal-Agent Theory and Nonprofit Accountability." Comparative

Corporate Governance of Non-Profit Organizations: 73-125.

Tam Cho, Wendy K., and Brian J. Gaines. 2007. "Breaking the (Benford) Law: Statistical Fraud

Detection in Campaign Finance." The American Statistician 61 (3): 218-223.

Tinkelman, Daniel. 1999. "Factors Affecting the Relation between Donations to Not-for-profit

Organizations and an Efficiency Ratio." Research in Government and Nonprofit Account-

ing, 10: 135-161.

Trussel, John. 2003. "Assessing Potential Accounting Manipulation: The Financial Characteris-

tics of Charitable Organizations with Higher than Expected Program-Spending Ra-

tios." Nonprofit and Voluntary Sector Quarterly 32 (4): 616-634.

United Way. (n.d.). "Accountability." Accessed October 21, 2017. https://www.unitedway.org/about/public-reporting/accountability.

Urban Institute and Center on Philanthropy at Indiana University. 2004. "What We Know about Overhead Costs in the Nonprofit Sector." Nonprofit Overhead Cost Study, Brief No. 1. https://www.urban.org/sites/default/files/publication/57576/310930-What-We-Know-about-Overhead-Costs-in-the-Nonprofit-Sector.PDF.

Van Caneghem, Tom. 2016. "NPO Financial Statement Quality: An Empirical Analysis Based on Benford's Law." VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations 27 (6): 2685-2708.

Van Slyke, David M. 2007. "Agents or Stewards: Using Theory to Understand the Government-Nonprofit Social Service Contracting Relationship." Journal of Public Administration Research and Theory 17 (2): 157-187.

Varian, Hal R. 1972. "Benford's Law." The American Statistician 26: 65-66.

Wing, Kennard, Teresa Gordon, Mark Hager, Thomas Pollak, and Patrick Rooney. 2006. "Functional Expense Reporting for Nonprofits: The Accounting Profession's Next Scandal?" The CPA Journal 76 (8): 14.

Yetman, Michelle H., and Robert J. Yetman. 2013. "Do Donors Discount Low-Quality Accounting Information?" Accounting Review 88 (3): 1041-1067.

Appendix: Statistical Tests

If we treat the observed first digits as a set of unordered categories, the Pearson chi-squared test is appropriate. However, this test ignores the fact that the digit categories are ordered. The Kolmogorov-Smirnov test, which is well known and often employed in Benford studies, accounts for that ordering: it tests whether an empirical distribution equals a specified distribution and is nonparametric when the reference distribution is continuous. Because the null (Benford) distribution is discrete, however, Kolmogorov-Smirnov p-values are conservative and depend on the parameters of the null distribution (Conover, 1972). We therefore use the modified Kolmogorov-Smirnov test (Joenssen, 2015) and, to avoid numerical instability, rely on empirically simulated standard errors. The modified Kolmogorov-Smirnov test is powerful when there is a large violation of Benford's Law concentrated on a small portion of the support (the range of possible values) and less powerful when small deviations are spread throughout the support, because the test statistic depends only on the digit with the largest gap between the empirical and null distributions rather than combining the deviations across all digits.
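
For readers who want to see the mechanics, the following Python sketch tabulates first significant digits and runs a Pearson chi-squared test against the Benford frequencies. It is a minimal illustration rather than the exact procedure used in this paper; the synthetic amounts array is a hypothetical stand-in for a set of reported dollar figures.

    import numpy as np
    from scipy import stats

    def first_digit(x):
        """First significant digit of a strictly positive number."""
        return int(f"{x:.12e}"[0])   # leading digit via scientific notation

    def digit_counts(amounts):
        """Counts of first digits 1-9, ignoring zero and negative entries."""
        digits = [first_digit(x) for x in amounts if x > 0]
        return np.bincount(digits, minlength=10)[1:]

    # Benford's expected proportions for first digits 1-9: log10(1 + 1/d)
    benford = np.log10(1 + 1 / np.arange(1, 10))

    # Hypothetical stand-in for reported dollar amounts (log-uniform draws are
    # roughly Benford-distributed); in practice these would be Form 990 line items.
    rng = np.random.default_rng(0)
    amounts = np.exp(rng.uniform(5, 20, size=5000))

    observed = digit_counts(amounts)
    chi2, p = stats.chisquare(observed, f_exp=observed.sum() * benford)
    print(observed / observed.sum())   # observed digit shares (compare Figure 1)
    print(chi2, p)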

Although the modified Kolmogorov-Smirnov test accounts for the ordering of the digits, it does not account for the circularity of that ordering. If we keep increasing a number whose first digit is 9, it eventually gains one more digit and its leading digit wraps around to 1: the digit 2 is to the right of 1, the digit 3 is to the right of 2, ..., and the digit 1 is to the right of 9. The Freedman-Watson U2 test is specifically designed for distributions with circular support. The U2 statistic does not depend on where the minimum of the support is labeled, because a circle has no minimum; the Kolmogorov-Smirnov test, by contrast, relies on cumulative distributions that would differ if we started counting from 2 or 3. In addition, the U2 test incorporates the deviations in the frequencies of all digits, not just the largest deviation. The continuous Watson U2 test is known to be more powerful than the continuous Kolmogorov-Smirnov test when deviations from the Benford frequencies are spread throughout the support. Analytic results are not available for the discrete versions, but Monte Carlo simulations by Lesperance et al. (2016) find that the Freedman-Watson U2 test is more powerful than the Pearson chi-squared and Kolmogorov-Smirnov tests, except when deviations are expected to be concentrated on the larger first significant digits, in which case Pearson's chi-squared statistic is superior. Because we have no prior beliefs about the distribution of deviations, we prefer the U2 test whenever the results of the three tests differ.
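
The circular statistic can likewise be sketched in a few lines. The Python code below computes a discrete Watson-type U2 statistic for first-digit counts and simulates its p-value under the Benford null by drawing multinomial samples. It follows the general discrete Cramér-von Mises construction and is an assumption-laden sketch, not the implementation used by Lesperance et al. (2016) or Joenssen (2015); simulating the null distribution simply sidesteps the lack of analytic results for the discrete test noted above.

    import numpy as np

    def watson_u2(counts, p):
        """Discrete Watson-type U2: expected-weighted variance of the cumulative
        observed-minus-expected deviations (a circular analogue of Cramer-von Mises)."""
        counts = np.asarray(counts, dtype=float)
        n = counts.sum()
        z = np.cumsum(counts / n - p)   # cumulative deviation at each digit
        zbar = np.sum(p * z)            # expected-weighted mean deviation
        return n * np.sum(p * (z - zbar) ** 2)

    def u2_pvalue(counts, p, reps=10_000, seed=0):
        """Monte Carlo p-value: share of multinomial draws under the null whose
        U2 is at least as large as the observed statistic."""
        rng = np.random.default_rng(seed)
        n = int(np.sum(counts))
        stat = watson_u2(counts, p)
        null = np.array([watson_u2(s, p) for s in rng.multinomial(n, p, size=reps)])
        return stat, (null >= stat).mean()

    benford = np.log10(1 + 1 / np.arange(1, 10))
    # observed = digit_counts(amounts)             # first-digit counts from the sketch above
    # u2_stat, p_value = u2_pvalue(observed, benford)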

Figure 1  Whole Sample Compliance with Benford's Law

First Digit    Percent observed from the sample    Percent expected from Benford's Law
1              30.230                              30.103
2              17.547                              17.609
3              12.606                              12.494
4               9.569                               9.691
5               8.007                               7.918
6               6.674                               6.695
7               5.792                               5.799
8               5.086                               5.115
9               4.489                               4.576

Table 1  Variable Definitions

Professionalism
  Accrual accounting: Dummy variable, 1 if an organization used the accrual accounting method for all years (line J), 0 otherwise.
  External professional accountant: Dummy variable, 1 if an organization reported positive accounting fees for all years (line 31), 0 otherwise.
  Professional management: Dummy variable, 1 if an organization reported positive compensation of officers/directors (line 25) for at least one year, 0 if it reported zero compensation for all years.

Public Perception Compliance
  Zero fundraising expenses: Dummy variable, 1 if an organization reported zero fundraising expenses (line 15) while total contributions (line 1d) ≥ 10000 for at least one year, 0 otherwise.
  Zero management expenses: Dummy variable, 1 if an organization reported zero management expenses (line 14) for at least one year, 0 if it never reported zero management expenses in any year.

Funder oversight
  Government grants: Dummy variable, 1 if an organization reported positive government grants (line 1c) for at least one year, 0 if it never reported government grants in any year.
  Indirect public support: Dummy variable, 1 if an organization reported indirect public support (line 1b) for at least one year, 0 if it never reported indirect support in any year.
  Restricted donations: Dummy variable, 1 if an organization reported temporarily or permanently restricted donations (lines 68, 69) for at least one year, 0 if it never reported restricted donations in any year.

Control variables
  Size: Natural log of an organization's six-year average total revenue, expenses, or assets.
  Age: Age in 2003.
  Subsector: NTEE Major Subsector: 4 dummy variables for Arts, Culture, and Humanities; Education; Health; Human Services; and Other (excluded category).

Table 2  Benford Analysis for the Whole Sample, Selected Variables, and Organizations

                           MAD      U2 p-value   KS p-value   χ2 p-value   N
Whole Sample               0.0007   0.2043       0.049        0.0010       427329

Variables
Total Contributions        0.0053   0.3172       0.1149       0.0256       5032
Program Service Revenue    0.0056   0.0772       0.0721       0.0461       3828
Total Revenue              0.0052   0.0116       0.0064       0.0324       5953
Program Service Expenses   0.0044   0.0072       0.0251       0.1208       5365
Management Expenses        0.0047   0.2194       0.3988       0.2106       4810
Fundraising Expenses       0.0057   0.0438       0.0712       0.3628       1984
Total Expenses             0.0043   0.0424       0.3306       0.0875       5981

Organizations              Conformance (p > 0.05)         Non-conformance (p ≤ 0.05)
                           Average MAD        N            Average MAD        N
U2 p-value                 0.0152             182          0.0335             818
KS p-value                 0.0173             255          0.0345             745
χ2 p-value                 0.0136             144          0.0329             856

Table 3  Summary Statistics

Variable                 Mean        Median       Std. Dev.    Min         Max           N
Organizational MAD       0.02899     0.02534      0.01587      0.00634     0.12540       966
Age in 2003              23.3        20           14.8         2           75            966
Average total revenue    3,695,707   464,940.8    19,500,000   10,629.67   429,000,000   966
Average total expenses   3,422,326   416,458.3    19,100,000   0           391,000,000   966
Average total assets     6,653,499   564,138.50   27,300,000   -2,456.17   341,000,000   966

Organizational MAD by Indicators

Indicator                                                   Mean        Median      N
Used accrual accounting
  No                                                        0.0367563   0.0324955   388
  Yes                                                       0.0237825   0.0214654   578
Paid accounting fees
  No                                                        0.0308757   0.0263583   565
  Yes                                                       0.0263417   0.024258    401
Paid officers
  Never                                                     0.0361845   0.0325071   407
  At least one year                                         0.0237579   0.0208054   559
Reported $0 fundraising expenses (contributions ≥ 10000)
  Never                                                     0.0272043   0.0231083   385
  At least one year                                         0.0301792   0.0265947   581
Reported $0 management expenses
  Never                                                     0.0252536   0.0227743   722
  At least one year                                         0.0400601   0.036391    244
Received government grants
  Never                                                     0.0325104   0.0281157   537
  At least one year                                         0.0245913   0.0230038   429
Received indirect support
  Never                                                     0.0312527   0.026838    681
  At least one year                                         0.0235954   0.021413    285
Had donor restricted funds
  Never                                                     0.0344928   0.0312172   513
  At least one year                                         0.0227658   0.0204772   453
Subsectors
  Arts                                                      0.0280258   0.0254975   82
  Education                                                 0.029817    0.026838    133
  Health                                                    0.0248897   0.0221599   162
  Human Services                                            0.0294217   0.0261241   371
  Other                                                     0.0311762   0.0247028   218

Table 4  OLS Regressions on Organizational MAD

                                          (1)             (2)             (3)
Received government grants (≥ 1 year)    -0.00234***     -0.00231***     -0.00272****
                                          (-2.89)         (-2.90)         (-3.32)
Received indirect support (≥ 1 year)     -0.00220***     -0.00210**      -0.00241***
                                          (-2.66)         (-2.54)         (-2.85)
Donor restricted funds (≥ 1 year)        -0.00398****    -0.00421****    -0.00447****
                                          (-4.08)         (-4.40)         (-4.33)
Accrual accounting                       -0.00441****    -0.00412****    -0.00599****
                                          (-4.60)         (-4.33)         (-6.16)
Accounting fees                          -0.000767       -0.000802       -0.000753
                                          (-1.02)         (-1.07)         (-0.96)
Paid officers (≥ 1 year)                 -0.00612****    -0.00590****    -0.00701****
                                          (-6.94)         (-6.65)         (-7.80)
Reported $0 fundraising (≥ 1 year)        0.00105         0.000996        0.000955
                                          (1.23)          (1.17)          (1.10)
Reported $0 management (≥ 1 year)         0.00564****     0.00554****     0.00683****
                                          (4.72)          (4.60)          (5.56)
Education subsector                       0.00332**       0.00344**       0.00207
                                          (2.28)          (2.38)          (1.41)
Health subsector                          0.00185         0.00207         0.000147
                                          (1.30)          (1.46)          (0.10)
Human Services subsector                  0.00461****     0.00474****     0.00387***
                                          (3.78)          (3.90)          (3.14)
Other subsectors                          0.00529****     0.00527****     0.00484****
                                          (3.73)          (3.72)          (3.38)
Age in 2003                               0.0000743***    0.0000768***    0.0000538*
                                          (2.66)          (2.77)          (1.82)
Average total revenue (natural logs)     -0.00185****
                                          (-6.54)
Average total expenses (natural logs)                     -0.00194****
                                                          (-7.27)
Average total assets (natural logs)                                       -0.000289
                                                                          (-1.25)
Constant                                  0.0559****      0.0566****      0.0384****
                                          (16.45)         (17.31)         (13.74)
N                                         966             967             966
R2                                        0.370           0.376           0.346

* p < 0.10, ** p < 0.05, *** p < 0.01, **** p < 0.001. t statistics in parentheses.