Administration & Society 2004; 36; 131. DOI: 10.1177/0095399704263473
Daniel W. Williams, Evolution of Performance Measurement Until 1930
The online version of this article can be found at: http://aas.sagepub.com/cgi/content/abstract/36/2/131
Published by: http://www.sagepublications.com
Downloaded from http://aas.sagepub.com at Cape Breton University Library on December 5, 2008

Evolution of Performance Measurement


EVOLUTION OF PERFORMANCE MEASUREMENT UNTIL 1930

DANIEL W. WILLIAMS
Baruch College

Performance measurement originated at the early Bureau of Municipal Research. Over the next quarter century, it became more sophisticated through increased quantification and reliance on experts. However, its focus narrowed from government to government service. This narrowing is linked to reduced social activism among those who used these methods. The entire period saw combined interest in accomplishing results and containing costs. Leading advocates of measurement included Lent Upson, Clarence Ridley, Mabel Walker, and Edison Cramer. Ridley became the executive director of the International City Managers’ Association, where he continued to promote performance measurement for the next 30 years.

Keywords: performance measurement; productivity; outcomes; program evaluation; scorecards; performance budgeting; governmental cost accounting; surveys; experts

As Henry Petroski has shown, even such humble objects as forks and paperclips evolve in response to practical environmental forces and the creativity of dissatisfied or merely imaginative users (Petroski, 1992). Is this true also of practices such as performance measurement? Who were the critical inventors, and how did they influence its development? What was added and what discarded?

This article examines the development of performance measurement in the critical period from its origins through 1930. It examines the link to forerunners, including the social survey, cost accounting, and European collection of municipal statistics, and to codevelopments, including scientific management and the U.S. collection of municipal statistics at the Census Bureau. The primary purpose of this article is to examine the development of performance measurement after its origin through these critical formative years. In 1910, performance measurement was embedded in a broader set of practices called municipal research. By 1930, performance measurement was a distinctive activity. In the interim, its focus narrowed from government to government service and its primary purpose shifted from political accountability to management effectiveness.

AUTHOR’S NOTE: I would like to thank Romuald Litwin, Lynn Wang, Jeanette Ellis, Frederick Lane, Lynne Weikart, James Guyot, Ray Oman, Denise Wells, Hindy Lauer Schachter, participants in the Baruch faculty research seminar, participants at the 1999 regional conferences of the American Society for Public Administration and the American Political Science Association and at the delayed 2001 National Association for Budgeting and Financial Management conference, and the editors and reviewers for their comments, insight, and assistance. Errors are my own.

ADMINISTRATION & SOCIETY, Vol. 36 No. 2, May 2004 131-165. DOI: 10.1177/0095399704263473. © 2004 Sage Publications

The second section of the article describes the state of prototypical performance measurement in roughly 1910. The third section discusses at length the particular practices as they developed over the ensuing 20 years. These developments are examined and explained in the fourth section. The fifth section is a brief conclusion.

THE STATE OF PERFORMANCE MEASUREMENT IN 1910

The first extended implementation of prototypical performance measurement practices arose at the New York Bureau of Municipal Research (NYBMR) after 1906 (D. Williams, 2002, 2003). Although there were numerous antecedents to these practices, three particulars stand out: the social survey of the settlement houses, prior developments in municipal statistics, and the then-recent advances of cost accounting. The NYBMR’s research activities constitute prototypical performance measurement for two reasons. First, as with their modern descendants, they were focused on the efficiency and effectiveness of government. They focused on linking resources to intended governmental objectives (what is now called performance budgeting), results of governmental effort (outcomes), objectively chosen expectations (benchmarks), and fixing the organization to do better (productivity improvement).

Second, the NYBMR’s practices are, as explored in this article, the historical antecedents of current performance measurement practices. It is not uncommon for people to cite Clarence Ridley and Herbert Simon’s Measuring Municipal Activities (1938, 1943, 1948a) or their specifications for municipal reports (Ridley & Simon, 1948b) as the beginning of modern performance measurement (Bouckaert, 1992; Ehrenhalt, 1994). These publications descend from Ridley’s 1927 dissertation, Means of Measuring Municipal Government (Ridley, 1927a) and the nearly identical Measuring Municipal Government (Ridley, 1927b), his teaching at the University of Chicago, and his work at the International City Managers’ Association (Augier & March, 2001). In the foreword to his dissertation, Ridley acknowledged the assistance of Lent Upson, the director of the Detroit Bureau of Governmental Research and a 1912 graduate of the Training School for Public Service, which was part of the NYBMR (Government Research Association, 1933; Ridley, 1927a). More than a fifth of Ridley’s (1927a, 1927b) citations in Means of Measuring Municipal Government are linked to the NYBMR. And, most telling, the four categories Ridley (1927a, 1927b) explicitly treats in Means of Measuring Municipal Government were all similarly treated in 1912 by Henry Bruere (1912b), the original director of the NYBMR, in New City Government.

D. Williams (2002) showed that the roots of the NYBMR’s practices are primarily the survey, municipal statistics, and cost accounting.

The survey. In the closing decades of the 19th century, Jane Addams and Florence Kelley, leaders of the U.S. settlement house movement, imported Charles Booth’s social survey to discover facts about poverty (Converse, 1987; Sklar, 1991). The social survey was a method to gather detailed data about small areas. Data analysis used qualitative devices such as coded maps to reveal demographic information. Booth’s surveys of London are generally treated as the paradigm shift that prepared the way for modern social research. Henry Bruere, the original director of the NYBMR, was directly associated with the settlement houses. William H. Allen and Frederick A. Cleveland, his two codirectors after a 1907 reorganization, were indirectly associated with the settlement houses through their involvement with the Association for Improving the Condition of the Poor (AICP) (Dahlberg, 1966; Kahn, 1997).

Municipal statistics. Collection and analysis of statistics originated in the 1660s as the study of state facts (Porter, 1986). These practices were merged with the study of probability in the 1800s and became a general science of inductive method about 1900 (Porter, 1986; Stigler, 1986). The collection of quantitative social facts flourished during the late 1800s. It is in this context that the collection of municipal statistics developed in Europe. John Fairlie (1899, 1901, 1908), a prominent U.S. political scientist, communicated these European developments to the American audience.

In this early period, governmental, demographic, and commercial data were mixed together; the same report might contain information about government expenditures, births, and tonnage at the local port. By 1900, the governmental data sometimes looked like precursors of performance measurement, showing expenditures for various specific activities of government normalized by population.1 In the United States, national sponsorship for collecting municipal statistics began in earnest in 1898 at the Department of Labor and was transferred to the newly formed Census Bureau in 1902 (Fox, 1977; Hanger, 1901; Meyer, 1910).

Cost accounting. Modern cost accounting spread in the United States following Captain Henry Metcalfe’s 1885 text, Cost of Manufactures (Garner, 1954; Previts & Merino, 1979). Cost accounting associates costs with those factors that lead to them and with the ultimate uses to which they contribute. Partly to achieve cost accounting objectives, Frederick Clow (1896) adopted Adolf Wagner’s functional model of government for budget and accounting categories (Fox, 1977). The National Municipal League (NML) promoted uniform accounting based on Clow’s functionalism (Fox, 1977; Hartwell, 1901). Then the NML influenced the early Census Bureau to promote uniform municipal accounting across the country. The Census Bureau also tried to associate financial statistics (costs) with data on service provision, which it called physical statistics, but achieved little success (Cummings, 1913; Fox, 1977; Meyer, 1910; Willoughby, 1910).

The NYBMR combined these antecedents and codevelopments to empirically investigate government. It sought to promote a competent and hierarchical executive branch of government, retain a decision-making role for the legislature, and assist the public to better participate in democracy. The NYBMR also sought to expand governmental capacity while slowing the growth of, or even shrinking, taxation. Prototypical performance measurement practices uniquely offered an opportunity to meet these many objectives (D. Williams, 2002, 2003).

Early performance measurement practices included three sorts of measurements. First, the NYBMR sought to reform budgeting and accounting practices so that costs could be clearly associated with specific activities of government. This effort was an implementation of the NML’s agenda. Second, the NYBMR sought to develop specific real-time records of work performance so that activity and output could be clearly associated with costs. This effort was an extension of the NML and Census Bureau’s efforts and was influenced by scientific management after 1910. Third, the NYBMR measured social conditions, sometimes focused on needs assessment and sometimes on outcomes. This measurement was an extension of the settlement house practices (D. Williams, 2003).


Early performance measurement practices fulfilled two objectives. First, they communicated information to managers to show the nature and status of work completed and contributed to improved work productivity. Second, they supported budgetary decision making by revealing information about needs and program effectiveness (D. Williams, 2003).

An adequate account of the initial performance measurement practices requires a brief comment on the political context. Under the Federalist governments of George Washington and John Adams, the United States adopted a strong executive model of government following the advice of Alexander Hamilton. The administration responded to the president. The role of the legislature was to express the will of the people. Beginning with the election of Thomas Jefferson, there was a gradual erosion of presidential power in favor of Congress, which came to dominate the administration (White, 1948, 1951, 1954, 1958).

By the late 1800s, Jeffersonian government was widespread throughout the United States (White, 1954, 1958). However, shortcomings in legislative control of local government were apparent. Patronage, logrolling, and corrupt granting of franchises were a few of these shortcomings (Maxey, 1919; White, 1958; D. Williams, 2002). Advocates of a return to Hamiltonian practices were gaining strength. However, the fear of dictatorial power and a loss of responsiveness to the public retarded this return to Hamiltonian principles.

It is in this context that municipal research and prototypical performance measurement took root. Performance measurement-like practices were promoted as an improvement in efficiency and in transparency. Transparency was the vehicle to overcome resistance to executive government. Through budgeting, cost accounting, surveys, and reporting, the executive would be held accountable to the legislature and the public (D. Williams, 2002).

OVERVIEW OF DEVELOPMENT UNTIL 1930

In 1906, William H. Allen influenced R. Fulton Cutting to fund the Bureau of City Betterment as a function of the Citizen’s Union (Dahlberg, 1966). Henry Bruere, a colleague of Allen’s at the AICP, was hired as the director. Because of a spectacular study that led to the removal of Manhattan Borough President John F. Ahearn, the bureau received considerable attention. After a year, Cutting influenced other major donors, such as John D. Rockefeller, to commit enough funds to create the Bureau of Municipal Research (as it was originally known). Allen and Frederick Cleveland joined Bruere; the three served as secretary, technical director, and director, respectively, but were effectively three codirectors of the NYBMR.

A large part of the NYBMR’s early activities consisted of providing technical assistance to New York City for accounting reform, essentially to implement the NML’s functional budget as a form of cost accounting (Bureau of Municipal Research, 1907, 1916; Gulick, 1928). Cleveland had served on the NML committee that recommended uniform municipal accounting (Hartwell, 1901). This accounting work also led in the direction of work improvement programs, which began to reflect the influence of scientific management about 1910. By adopting the NML’s functional accounting practices, a government implicitly committed itself to improving its use of resources to achieve its ends; that is, it adopted the intent to improve its work practices. However, early work improvement programs focused on scheduling work so that it could be verified (Bruere, 1912b; Pultz, 1912; Taussig, 1912; Welton, 1912). More sophisticated work improvement programs are found after Louis Brandeis popularized scientific management from 1910 to 1911 (Cooke, 1913, 1915; Interstate Commerce Commission, 1911; Haber, 1964; Hammond, 1911; D. Williams, 2002, 2003).

Possibly because of the Ahearn incident or because of encroachment on corruption and patronage, the NYBMR quickly earned the enmity of Tammany Hall,2 which engaged in a smear campaign about the “Bureau of Municipal Besmirch.” According to the NYBMR’s literature, this smear campaign not only backfired in New York, but it also led to the bureau’s extensive outreach activities conducting surveys and setting up research bureaus across the country (Bureau of Municipal Research, 1916; Gulick, 1928). The application of the survey to governments or communities was effectively an audit of the government. This audit uncovered empirical evidence of governmental performance. The first city to receive these services was Philadelphia, the hometown of the NML. By 1916, there were 15 bureaus of municipal research and, in 1928, there were 74 (Bureau of Municipal Research, 1916; Gulick, 1928). Research bureaus continued to proliferate through the beginning of World War II (Gill, 1944).

In 1912, the work of the NYBMR and the other bureaus it spawned was highlighted in the May 12 edition of the Annals of the American Academy of Political and Social Science. At about this time, governmental surveys had been conducted in Philadelphia, Boston, and Chicago (Bureau of Municipal Research, 1916; Gulick, 1928; Woodruff, 1910).3 The Chicago study was known as the Merriam Commission after Charles Merriam, a Chicago alderman and future giant of American political science. By this point, municipal research had become an established method for interested civic leaders to monitor government performance. The bureaus continued to use surveys to study government and to examine social issues into the 1920s.4

By the 1920s, cost accounting was much more sophisticated. This can be seen in the work of A. E. Buck and William Watson (Buck, 1924; Buck & Watson, 1926). The use of the survey also continued to expand. Charles Beard (1923), who was the director of the Training School at the NYBMR from 1915 to 1918 and director of the Bureau from 1918 to 1921, used the survey method to assist with reconstruction after the Great Tokyo Earthquake of 1923. Lent Upson (1924b) used the survey as a method to improve many local governments. The study of Cincinnati and Hamilton County, edited by Upson, is effectively a textbook in public administration. Soon, he wrote such a textbook (Upson, 1926). William B. Munro (1926) wrote an alternative text that recorded 25 criteria for government.

This cumulating body of work led to the 1927 publication of Clarence Ridley’s Measuring Municipal Government (Ridley, 1927a; Ridley, 1927b), a landmark text in the development of performance measurement. It led to further work by Ridley and others at the International City Managers’ Association, where Ridley was executive director from 1928 through 1956 (Ridley & Nolting, 1933, 1934; Ridley & Simon, 1938, 1948a). Ridley’s early work was soon followed by Mabel Walker’s 1929 attempt to develop an index of quality of life in cities of 30,000 population or greater (M. Walker, 1929, 1930).5 Ridley’s and M. Walker’s texts were academic documents. At about the same time, Edison Cramer (1929) completed A Survey of the General Civic Conditions of Colorado Cities Having a Population of 2,000 or More for the Colorado Municipal League, reflecting the growing adoption by practitioners.

FORMS AND DEVELOPMENT OF MEASUREMENT PRACTICES

The developing measurement practices can be classified into several broad categories: research into government management, conduct of the survey, development of the scorecard, and deepening reliance on subject matter specialists for development of measures and standards.


RESEARCH INTO GOVERNMENT MANAGEMENT

The bureaus of municipal research engaged in several forms of research into government practices. First, as early as 1907, the NYBMR helped New York City develop and implement a functional budget and accounting system. Functional categories replaced lump-sum appropriations and served to relate funding to the particular work units; thus, it was a form of cost accounting. This program was designed to show where money was being spent (functional accounting) and where there was public need (budgeting). It was used with success to argue for increased appropriations for the New York City Department of Health in 1907 (Bureau of Municipal Research, 1907, 1916; D. Williams, 2003).

However, by the early 1910s, the functional budget had become line-item budgeting and blocked administrative discretion all too well (Dahlberg, 1966). At the same time, the NYBMR had essentially infiltrated the executive branch in New York City, so it no longer particularly distrusted government officials. It began to support appropriation in broader categories while retaining narrow categories for accounting and budget preparation (Dahlberg, 1966; Gulick, 1928). This refocusing amounted to a support for executive budgeting where legislative control was dependent on expressed, but not enacted, plans.

Both the earlier functional budget and the later executive budget contributed to the development of performance measurement practices. The functional budget was conceived as a form of cost accounting where both planned and actual costs could be compared with each other and with past years; other communities, services, and products; and results. Thus conceived, the functional budget was a device to convince decision makers to allocate funds to specific needs (Bureau of Municipal Research, 1907). Functional accounting was used to verify that funds were used for these purposes and to show whether results were achieved.

When the NYBMR later began to advocate relaxed appropriation categories, the role of reporting became more important. Executive budgeting left the appropriating authority with weaker control over actual expenditures and thus more dependent on accounting and reporting to know whether funds were used as planned. The legislature’s power was found in the ability to accept or reject the next budget (Goodnow, 1912).

The support of the executive budget was not entirely consistent with the NYBMR’s original objectives, because this newer system began to close the door on public knowledge of governmental activities. There was a heated controversy when Maryland adopted an executive budget system in 1916 (Allen, 1917; Chase, 1917; Schachter, 1997). Allen, who was no longer associated with the NYBMR, opposed this system, which had been designed by NML experts including his former colleague, Cleveland. Allen objected to the Maryland law because it set only limited requirements for public, or even legislative, input. To some degree, this debate was a proxy battle over the then pending and soon to be enacted Budget and Accounting Act of 1921. The executive budgeteers won on all fronts.

The second category of research into government management focused on personnel. In the earliest period of the NYBMR, this interest centered on getting a full day’s work out of government employees (Bruere, 1912a; Pultz, 1912; Welton, 1912). It was thought that some employees, such as water inspectors or public works employees, put in very little actual work. To get the work done, the NYBMR recommended a system of work planning, scheduling, reporting, and inspection. The NYBMR also advocated the improvement of employee efficiency ratings with interest in more detailed and real-time work records rather than retrospective assessments. Originally, the objective was to validate claimed work activities. After 1910, these practices became more sophisticated under the influence of scientific management.

During the 1910s, the NYBMR became interested in standardization of work processes in two senses. First, there was the core scientific management interest in defining the best way to do each type of job. Second, there was an interest in setting time and resource standards for work. This second sense was carried over from the NML’s uniform accounting objectives. Standardization was also carried into financial management with a particular focus on purchasing (Agnew, 1924; Barnum, 1924; Bruere, 1915; Burks, 1912; Connell, 1912; Cooke, 1918; Dunaway, 1916; Klein, 1912; Scott, 1924). Standardization cannot be completely distinguished from the scorecard practices discussed below. Scores were used to gauge how well standards were met.

Standardization is one of the main practices that the research bureaus borrowed from scientific management.6 Morris Cooke (1924), the most direct transplant from the scientific management community to public administration, invoked Frederick Taylor’s name to assert that the NYBMR’s budget system was an obstacle that must be gotten around. He also criticized the NML’s accounting practices (Cooke, 1915). Most significantly, Taylor and Cooke advocated distributed management; Taylor derogatorily referred to hierarchy as military management, which he considered a principal obstacle to efficiency (Cooke, 1910, 1915; Taylor, 1947/1903). The NYBMR and its legacy, particularly in the work of Luther Gulick (1981), explicitly rejected Taylor’s distributed management.7 As a consequence, the scientific management that took root in public administration was rather mutated, borrowing more from the technical developments of standardization and cost accounting than the whole scientific management program.

SURVEYS

The bureaus of municipal research conducted many surveys.8 These can be classified into three categories. First, the bureaus conducted studies of the entire community environment. These comprehensive studies were considered the first step in helping a community to establish its own research bureau (Bureau of Municipal Research, 1916; Gulick, 1928). They included a study of government power and structure, industry, social conditions, health, charity, and the physical environment (Bruere, 1912b; Bureau of Municipal Research, 1916). The survey collected data through firsthand observation, review of any available statistical data, review of laws and government reports, and interviews with government officials and selected citizens.9 The purpose was to gather a comprehensive description of the community to learn what conditions required improvement and what did not. One study of 10 cities that had adopted the commission form of government10 was conducted to compare this governmental form with the practices advocated by the NYBMR (Bruere, 1912b).

This comprehensive survey was adopted by socially active groups who began to promote their own citizen surveys (Aronovici, 1910, 1916; Wisconsin Conference of Social Work, 1927).11 The citizen survey of 1920 was not a survey of the citizens by the government or its proxy; it was a survey of the government or the entire community by the citizens.

Second, the bureaus engaged in special-topic studies focusing on such matters as housing conditions, health, poverty, schools, recreation, or any other social matter that was thought to require analysis (Mark, 1916; Treleven, 1912). These studies are the bureaus’ most direct continuation of the settlement house social survey. The function of the bureaus’ special-topic studies was to highlight social problems that needed addressing through public policy. These surveys got the facts before the public in a manner that made ignoring problems difficult. For example, surveys that demonstrated an exposure to the risk of typhoid and death (as with a 1916 survey of Portsmouth, Ohio), when the technology to avoid it was well known, seemed obviously to demand implementation of the technology (Mark, 1916).


The third form of survey was the government study (Beard, 1923; Bureau of Municipal Research, 1916; Gulick, 1928; Upson, 1924b).12 This sort of survey was similar to the special-topic studies with two exceptions. First, government surveys did not focus on social issues or social problems; they examined government conditions. Second, the bureaus engaged in a considerable number of these surveys and developed them to a greater degree than other surveys. This sort of survey could be considered a program audit of the entire local government. It examined matters of governance such as the laws and powers of the government; the means of public participation and citizen access to government; and the financial condition, budgeting, accounting, purchasing, and other elements of financial management, personnel management, and the functions of government agencies. It also addressed the delivery of services such as police, fire, health, and public works (Bruere, 1912b). It included function-by-function recommendations for modernization of the government.

These surveys were expected to uncover objective facts, but they were not necessarily neutral. Bruere’s New City Government (1912b) was undisguised in its preference for strong executive models of government in opposition to legislator-administrators. The objective facts were thought to demonstrate the reasonableness of this preference.

Two studies worth noting are the study of Cincinnati and Hamilton County, Ohio, led by Upson (1924b) of the Detroit Bureau of Governmental Research and the study of Tokyo conducted by Beard (1923), who had been the director of the NYBMR Training School and the NYBMR. These studies, conducted in the early 1920s, reflect both the government research discussed in the previous section and the application of survey methods of this section. They examined both the delivery of government services and the conduct of government itself.

THE SCORECARD/INDEX

The survey developed into the scorecard by the end of the 1910s and was becoming the index at the end of the next decade (“A Score Card for West Virginia Cities,” 1923; Ayres, 1920; Federal Council of Citizenship Training, 1924; “How One City Scored Itself,” 1923; Hudson, 1926; “Measuring Government Efficiency,” 1924; Ogburn, 1917; M. Walker, 1930). However, the scorecard is also linked to the use of standards. The scorecard used a point system to standardize results of the survey (Bruere, 1912b; Commons, 1908; Richards, 1915). It provided the opportunity to compare various communities on a single index (Ayres, 1920; Bracy, 1924; Cramer, 1929; Deacon, 1926; Hudson, 1926; Ogburn, 1917; Palmer et al., 1925; Schneider, 1916; M. Walker, 1929; H. Williams, 1927). Point values were assigned to each survey element reflecting how well an object of observation met a standard. The standard could be either technical or normative. The point total represented the quality of community life or the quality of some particular aspect of community life. Thus, communities could be objectively compared, and local residents could know how their city compared with the country. However, there were many scorecard schemes developed over the two decades of the 1910s and 1920s, which confounded the comparative objective.13

Two technical difficulties with the scorecard were with the methods of assigning points and of assigning relative values to each item. For the former problem, some scorecards gave instructions for some or all items (Strayer, Engelhardt, & Elsbree, 1927), whereas others simply listed the maximum available points beside a question or topic (“A Score Card for West Virginia Cities,” 1923; Wisconsin Conference of Social Work, 1927). The points themselves reflected degrees of approximation to a desired level.

Relative values might be set as equal, set based on the judgment of authors, or left to the user to decide (Commons, 1908; Cramer, 1929; Deacon, 1926; Ogburn, 1917; Palmer, 1926). In 1924 and 1925, American Political Science Association (APSA) round tables recommended continued research in various aspects of relative values (Cottrell, 1925; Upson, 1924a). For this period, the only identifiable research-based estimate of relative values is with a proposed health department scorecard that set weights for health work based on damage to be overcome by health services (Schneider, 1916). The scorecard was not necessarily a primary data collection instrument; it could be used to collect elements of secondary data into one place (Ogburn, 1917; M. Walker, 1929).
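The arithmetic of points and relative values described above can be illustrated with a brief sketch. It is not a reconstruction of any historical scorecard: the function name, the item names, the maximum points, and the weights are all hypothetical, and scaling the result to 100 is an assumption made for readability.

```python
def scorecard_total(items, weights=None):
    """Combine item scores into a single number on a 0-100 scale.

    items:   dict mapping item name -> (points_awarded, max_points)
    weights: dict mapping item name -> relative value; equal weights if omitted
    All names and values here are hypothetical illustrations.
    """
    if weights is None:
        weights = {name: 1.0 for name in items}  # "set as equal"
    total_weight = sum(weights[name] for name in items)
    score = 0.0
    for name, (awarded, maximum) in items.items():
        if not 0 <= awarded <= maximum:
            raise ValueError(f"points for {name!r} must lie between 0 and {maximum}")
        # Points express the degree of approximation to the desired level.
        score += weights[name] * (awarded / maximum)
    return 100.0 * score / total_weight


# Hypothetical city scored on three items.
city = {"water supply": (8, 10), "fire protection": (15, 20), "schools": (5, 10)}

equal = scorecard_total(city)
weighted = scorecard_total(
    city, {"water supply": 1, "fire protection": 2, "schools": 1}
)
print(round(equal, 1), round(weighted, 1))
```

The same observations yield different totals under the two weighting schemes, which is the sense in which the choice of relative values was itself a matter for judgment or research.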

The scorecard was borrowed from earlier use in agriculture on at least three occasions. In 1908, John Commons borrowed a scorecard from “standardizing and grading agriculture products, such as wheat, corn, oats, butter, cheese, horses, cows, pigs, and so on” (p. 126). He used the scorecard to normalize housing characteristics, so his approach is an ancestor to the broader-than-economics use of indexes.14 In 1907, the NYBMR reproduced a scorecard used for dairy inspection in its account of its work at the New York City Department of Health (Bureau of Municipal Research, 1907). A few years later, Bruere (1912b) used a scorecard approach in rating 10 commission governments, apparently having adopted it from his own prior experience at the NYBMR. However, he used the term scorecard in quotes (Bruere, 1912b, p. 21), possibly reflecting an awareness of Commons’s work. In 1915, E. G. Richards borrowed the scorecard from dairy inspection to aid in fire insurance rating of cities.

By the 1920s, the scorecard had become a robust instrument. It was used by professional and community groups to rate communities and governments, it was used by experts as a basis for measuring government performance, and it was used by specialists to measure the performance of particular government activities such as the services of the health department (Ayres, 1920; Federal Council of Citizenship Training, 1924; Palmer, 1926; Strayer et al., 1927; H. Williams, 1927; Wisconsin Conference of Social Work, 1927).

William Ogburn (1917), Mabel Walker (1929, 1930), Edison Cramer (1929), and others transformed the scorecard into an index. Ogburn’s 1917 survey of 36 major U.S. cities uses secondary data to compute an index that aggregates scores on 17 criteria: wage rate, cost of living, death rate, infant mortality rate, population married, church membership, child labor, parks, pavement, fire loss, public properties, circulation of library books, school attendance, school property, teachers’ salaries, number of pupils to a teacher, illiteracy, and the number of foreign-born persons unable to speak English.15 These criteria are most similar to those of Allen and Bruere, who also focused on community matters, rather than Beard and Upson, who focused solely on government, or Ridley and M. Walker, who focused solely on government service.

M. Walker (1929, 1930) used municipal data from various reports and almanacs to develop a grade for 160 cities with a population of 30,000 or more. M. Walker emphasized the word results, which she used to distinguish her rating from measures of government process or activity. Cramer’s (1929) index for Colorado cities is based on the data collected from the cities themselves through either personal visits or responses to questionnaires. The data comprise a mixture of performance data, such as that M. Walker emphasized, and cost-per-capita data for fiscal year 1928. Component scores in seven categories are separately ranked for cost and accomplishment. Ranks are added across service categories to produce a final score.16 This study followed a study of 1923 expenditures by William Bracy (1924) that also contained both cost and service data. Bracy’s study does not reveal enough information to show exactly how these two types of data were combined, in part, because the sources of cost are not, as with Cramer’s study, directly associated with services. So, Cramer’s index is concerned with both productivity in the narrow sense and with outcomes.
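A rank-sum index of the general kind attributed to Cramer above can be sketched as follows. The sketch rests on several assumptions that the text does not settle: that lower cost and higher accomplishment each earn rank 1, that ties share the better rank, and that a lower total is better. The function names, city labels, categories, and figures are hypothetical.

```python
def rank(values, higher_is_better=False):
    """Return 1-based ranks for a dict city -> value (ties share the better rank)."""
    ordered = sorted(values.values(), reverse=higher_is_better)
    return {city: ordered.index(value) + 1 for city, value in values.items()}


def rank_sum_index(cost, accomplishment):
    """Add cost ranks and accomplishment ranks across all service categories.

    cost, accomplishment: dict category -> dict city -> value.
    Assumed convention: a lower total indicates a better overall standing.
    """
    totals = {}
    for category in cost:
        cost_ranks = rank(cost[category])  # lower cost per capita ranks first
        result_ranks = rank(accomplishment[category], higher_is_better=True)
        for city in cost_ranks:
            totals[city] = totals.get(city, 0) + cost_ranks[city] + result_ranks[city]
    return totals


# Hypothetical data for three cities and two service categories.
cost = {
    "fire": {"A": 2.0, "B": 3.0, "C": 2.5},
    "parks": {"A": 1.0, "B": 0.5, "C": 0.8},
}
accomplishment = {
    "fire": {"A": 90, "B": 95, "C": 80},
    "parks": {"A": 4.0, "B": 2.0, "C": 3.0},
}

totals = rank_sum_index(cost, accomplishment)
print(totals)
```

Because ranks discard the size of the differences between cities, an index built this way records only relative standing within the group compared.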

SPECIALIZATION AND SPECIALISTS

In 1907, Allen was equally at home with hospital, charity, or school efficiency. He discussed criminology and even efficiency of religion. In 1912, Bruere (1912b) discussed health, crime, fire, public works, accounting, budgeting, purchasing, and citizen participation. Ogburn’s 1917 study addressed 17 categories of service. Upson’s (1924b) report on Cincinnati and Hamilton County, Ohio, also reflects extensive breadth. In 1927, Ridley (1927a, 1927b) narrowed the discussion to four services: fire protection, health, police, and public works. He provided only brief guidance for measuring other functions. Ridley’s work became the foundation for future development, so this narrowing is significant.

One characteristic that distinguishes fire protection, health services, and police work from the broad range of matters discussed by Allen or Upson is that measurement of these services and their outcomes had long been the study of subject matter experts. As early as 1912, Henry Bruere (1912b, p. 298 ff) cited the work of the National Board of Fire Underwriters, which kept track of fire loss. After 1915, the underwriters used a scorecard approach to rate cities on risk of fire loss based on a combination of fire fighting capacity and environmental conditions (Richards, 1915; H. Walker, 1926a, 1926b, 1926c, 1926d). Fire loss was a clear measure of results, one of the few that were available at that time. Although Ridley (1927b, pp. 13-22) recommended a change in the unit of measure for fire loss, he accepted much of the fire underwriters’ program.

Measurement of health services was the beneficiary of even more expert study. From the start of the municipal research movement, infant mortality was viewed as the most important indicator of local conditions (Allen, 1907, p. 72; Bruere, 1912b, p. 27). Throughout the ensuing 20 years, studies frequently returned to measuring health status, particularly of children. Although various social surveys were in use, a frequent topic was health, which often meant health of children. Innovators in measuring community health included the American Public Health Association, the American Child Health Association, and the federal government. Typically, they developed a scorecard-style rating system (Committee on Administrative Practice, 1926; Palmer et al., 1925; U.S. Public Health Service, 1926). The scorecards were used to judge health departments or health services as much as health outcomes. For example, the American Child Health Association study examined service capacity or delivery issues such as the number of nursing visits per 1,000 infants along with, of course, infant mortality (Palmer et al., 1925, pp. 136-137). Other matters such as enforcement of laws were sometimes addressed (Committee on Administrative Practice, 1926, p. 55). However, at no point throughout this period did these professionals lose track of infant mortality. Ridley (1927b, p. 30) devoted a considerable portion of his text to discussing the validity of these measures, which he concluded to be reasonably valid.

Police work had also received some prior expert attention. However, it had not reached the same level of sophistication as fire protection or health services. Ridley (1927b, p. 31) considered the objective of police work to be preventing crime, not solving crime (convictions), as it had been 20 years earlier (Bruere, 1912b, p. 279). Ridley remained concerned with arrests and convictions but was also interested in the reporting of crimes (complaints) as a measure of crime prevention.

Although Ridley knew that some communities kept crime statistics, there was no method of comparing them (Quigley, 1925; Ridley, 1927b). As he said, “No real standard of measurement for police work now exists” (Ridley, 1927b, p. 31). In 1924, Hugh Lester recommended uniform crime classifications. In 1930, the International Association of Chiefs of Police began collecting uniform national crime data. Their work was associated with the bureaus of municipal research through the service of Lent Upson (Committee on Uniform Crime Records of the International Association of Chiefs of Police, 1930; Smith, 1929). As with health and fire protection, the measurement of crime prevention and detection was dominated by experts; however, the measurement itself was not as advanced.

Although fire, health, and police services were the object of expert study, measurement of public works had hardly begun (Ridley, 1927b, pp. 39-46). Ridley (1927b) argued that public works cannot be measured in aggregate impact on the community but must be measured separately for each service. He discussed streets, snow removal, street cleaning, street lighting, sewers, refuse disposal, and water supply. For each of these, he looked to measures in the form of area or quantity supplied or moved. Emphasis is on getting an exact unit that can be compared across communities. For all of these areas, he characterized his recommendations as suggestions, reflecting a lack of more than incidental use by governments.

I will discuss two specific sets of Ridley’s suggestions, those for street cleaning and sewage. For street cleaning, Ridley (1927b) went to some length to be specific about area and raised additional issues such as “grade of street; type of pavement; character and amount of traffic; whether cleaning is done by day or by night; quality and character of the refuse; paved or unpaved intersecting streets; and, lastly, the standard of cleanliness” (p. 42). These standards reflect little advance and, in some ways, a decline from the work of the Census Bureau as reported by Meyer in 1910, where specifics of refuse are not so clearly addressed but additional consideration is given to frequency of cleaning. For sewage, Ridley (1927b, p. 43) discussed factors contributing to the cost of system construction and maintenance. Nowhere did he ask what the purpose of a sewage system might be. This stands in stark contrast to the view offered by Walter Wilcox in 1896: “Hence the benefit of a sewerage system should be measured in terms of decreased mortality rather than in terms of increased productivity” (p. 378). In summary, public works measurement was no more developed in 1927 than in 1910 and had perhaps somewhat regressed.

The bulk of Ridley’s text discusses these four areas.17 With the first three, subject matter specialists (fire insurance companies, public health workers, and emergent criminologists) took charge and developed measurement techniques. Health and fire protection had seen considerable advances, whereas police work was under study. The measurement of public works reflects little advance over the earliest studies.

Mabel Walker (1930) similarly narrowed her discussion to three principal categories: public works, protective service, and welfare. She included five measures for each of the first two categories and six for the third. Her public works measures focused on street cleaning, garbage collection, sewerage, and paved highways and were volume measures similar to Ridley’s, although she was interested in the proportion of the population served with sewage systems. Her protective services included Ridley’s other three categories: police, fire, and health. She was unable to obtain any useful data on police. Her measure of fire protection was loss. She measured health services through death rate, communicable disease rate, and infant mortality rate. Her welfare categories were schools (primarily), libraries, and parks. These services were measured in terms of volume of services provided including the number of books circulated, the area of park land per capita, and, for schools, the proportion of the population served. Although her categories are somewhat broader than Ridley’s, they are still considerably narrower than Upson’s or Allen’s. Her three additional objects of measurement also benefited from expert attention (Ayres, 1920; Bureau of Labor Statistics, 1928; Emmons, 1926; Gulick, 1929; Hansen & Wheeler, 1927; Martin, Davis, & Keppel, 1926; Strayer et al., 1927).

Six of Cramer’s seven categories match M. Walker’s (Cramer, 1929; M. Walker, 1930). His additional category is water works, a municipal enterprise. In the earlier study, William Bracy (1924) used essentially the same categories as Cramer. Cramer’s study also includes discussion of other matters such as elections, tax rates, and debt, but these factors are not tabulated in the rankings. Cramer’s study differs from Ridley’s methods or M. Walker’s index in that the objects of study originated with Bracy in 1924. This earlier origin can be seen in the collection of election data, which reflect the examination of citizens rather than government services. However, these citizen data are not included in Cramer’s final index.18 It is not possible to tell from Bracy’s report whether he included citizen data in his index. Although Cramer’s reasons for including categories may differ from M. Walker’s or Ridley’s, the practical effect is the same: Fire loss, infant mortality, acres of parks per capita, and volumes of books circulated per capita were, by the late 1920s, the object of expert measurement.

Thus, by the late 1920s, performance measurement had come to reflect observation by experts. Overall, government was no longer a frequent object of examination. Performance measurement was used to examine how government services were delivered to the public, primarily by focusing on the volume or consequences of services. And the survey had evolved into the scorecard and then the index.

CHANGES AND CONTINUATION

How did performance measurement change between the early work by the NYBMR and the end of the 1920s? What remained unchanged?

SOPHISTICATION

There was increased sophistication of methods in some areas, particularly health and fire protection. Police work measurement had begun to become sophisticated. Other areas of study such as education, libraries, and parks were developing specialized technical performance measurement, but development was uneven. Some types of activity, such as public works, had seen little change over these 2 decades.

Another technique that saw more sophistication was cost accounting. At the beginning of this period, cost accounting was addressed through functional budgeting and accounting and the calculation of work unit costs. In 1926, A. E. Buck and William Watson rejected functional accounting as cost accounting and instead worked with cost centers. Although cost centers can be organizational units, they did not have to be. Buck and Watson (1926) defined three sorts of cost units: production units where there is a tangible output of an activity; work units where there is no tangible output (they give the example of student-hours of instruction); and service units where there is a mixture of tangible and nontangible output. They discussed such matters as indirect costs, overhead allocation, and consideration of controllable and uncontrollable costs. In summary, they had adopted standard industrial cost accounting.

Buck and Watson (1926) were clear that cost accounting is a cost-of-production concept and is used to determine whether the production cost is extraordinary. Such answers, they said, do not show whether the end user gets ultimate value out of the service; for example, cost accounting can say how many days of care a patient received and the cost of delivering that care but not whether the patient is left better off. Buck (1924) was not uninterested in results in government but held that results are simply not what cost accounting studies. Buck and Watson proposed standard costs to which an administrator can compare actual costs to gauge whether they are reasonable. Unlike early innovators (Bachman, 1912), they expressed no interest in the possibility that costs are too low.

QUANTIFICATION

The sophistication of cost accounting reflects a general movement toward quantification. The use of scorecards also reflects this quantification. With a scorecard, communities can be compared with a single number. The scorecards also became more sophisticated, as with the more sophisticated weighting of points near the end of this period.

Another example of quantification is a study directed by Upson (1924b) concerning the governments of Cincinnati and Hamilton County, Ohio. This study quantifies something in nearly every section such as the proportionate disposition of cases by judges (Upson, 1924b, p. 229), the comparative number of crimes in various localities (p. 241), fire loss (p. 265), mileage by pavement type (p. 293), relative efficiency of street cleaning (pp. 317-322), average load of a garbage trip (p. 328), relative cost of cleaning the court house (p. 337), death rates (p. 352), and relative access to parks (p. 292). Where quantification is not possible because of lack of data, as with the comparative efficiency of building inspection with transportation by foot or by automobile, there is an effort to estimate (Upson, 1924b, p. 274), and Upson later carried out this measurement (Upson, 1926).

Quantification was not universal; Charles Beard’s The Administration and Politics of Tokyo (1923) contains very little quantitative data. However, even it contains many recommendations for Tokyo to implement quantitative measurement. For Upson (1926), what was not yet measured would be in the future: “It may not be long until bookkeepers, clerks, and stenographers will be required to measure and report on a standard day’s work” (p. 148). In later times, there were efforts to implement these very measures (Bills & Dickenson, 1937; Curtis, 1937; Rosenberg, 1948). This effort was not always appreciated; as William D. Carey (1946) put it, “Occasionally, some rear-echelon genius with a slide rule will devise units of weighted measurement for each individual type of action, requiring the poor wretch in the field office to convert his telephone calls and paper actions into points and decimals” (p. 24).

With scorecards, quantification brought problems to overcome. First, scoring required assigning of scores to the component elements. As the modern survey researcher might observe, this is tricky business. The score is not an unprocessed observation, which itself requires some care to capture reliably. It is, instead, a judgment, which is very difficult to make objective at all. As time progressed, efforts were made to guide the observation and scoring. In Mabel Walker’s work, where scoring has become constituent to indexing, the scoring is made the rigid consequence of formulas. However, she discussed the character of her own problems working out these formulas and reflected the need for continued input of analyst judgment (M. Walker, 1930, p. 67).19 In effect, she declared that some levels of achievement are unrealistically demanding.

The second problem is one of assigning weights. To combine scores from various observations, one must assign weights to each constituent element. In 1912, Bruere (1912b) assigned equal weights to the elements of his study. In 1908, Commons dismissed the problem of weights as something that any researcher could settle to his own satisfaction. The fire underwriters assigned their weights based on relative contribution to fire risk (Richards, 1915). In 1916, one health services scorecard assigned weights based on relative damage to be overcome through various services (Schneider, 1916). In 1924, the American Political Science Association convened panels and committees focused on addressing the problem of setting weights for scorecards (Upson, 1924a). In her 1929 study, M. Walker (1930) assigned weights to provide roughly equal balance to her three major categories of government service. Cramer (1929) weighted seven categories of service equally and two categories of measure equally.

Weighting was important and not easily solved.20 Here we come face to face with the growing belief that science is value neutral. This view holds that science studies facts that include no normative component. The academic side of public administration began to adopt this view. Meanwhile, by the late 1920s, applied public administrators were adopting a norm of neutrality, which Louis Brownlow, a former city manager and a future advisor to President Franklin Roosevelt, later called a “passion for anonymity.” He meant that public administrators, or at least some of them, should act as neutral agents (Neustadt, 1963).21 The objectification of scores and indexes aimed at the public administrators’ norm of neutrality or the scientists’ value-free observation. But the scorecard and the index are not neutral. The selection of items, weighting of selected items, and assignment of scores to observations all reflect normative decisions. Quantification did not eliminate normative decisions; it simply compressed them into a concise result.

RESULTS

It has been alleged that early 20th-century public administrators were interested primarily in tax savings or, at most, in the narrower sense of efficiency, that is, getting the most output from the least input (Bouckaert, 1992, p. 17; Stivers, 2000; Waldo, 1948). D. Williams (2003) has shown that the early innovators did not believe that management efficiency would guarantee good results, but they believed that management inefficiency would likely lead to poor results. In part, this confusion rests on the meaning of the word efficiency. In the current era, organizational efficiency is a nearly mechanical notion. At the beginning of the 20th century, the mechanical analogy was less exact. In the introductory chapter of Efficient Democracy (1907), Allen related a story of the Lyman School for Boys:

A chart was prepared for the Chicago Exposition to portray graphically this information. But to the chagrin of all, the chart when completed showed that a distressingly large percentage of the boys were serving second and third sentences at various penal institutions, while a painfully small percentage could be referred to with pride. The directors believed their chart and devised for future guidance a new test, namely, results counted, efficiency. (p. 10)

Here, efficiency is defined by results, not the reverse. Throughout this period, the point was normally to obtain good results, not merely to save money, although there were advocates of the latter. With Ogburn, we see that results such as infant mortality or fire loss remained of interest in the late 1910s. With Ridley, M. Walker, and Cramer, results continued to be of interest in the late 1920s. Categories for which results were observed did not much change. For example, each of Ridley’s categories was addressed in Bruere’s The New City Government (1912b). Between Bruere and Allen (1907), most of M. Walker’s and Cramer’s categories were addressed, as well.

The concept of results was never adequately clarified. Infant mortality and fire loss clearly relate to socially desirable outcomes. However, volume measures, such as the number of books circulated, are output measures. Socially desirable outcomes would depend on the nature of the books and what was done with them. Park acreage per capita is a service capacity measure. To some degree, results depended on what could be measured; for example, Ridley recommended measuring police service outcomes with complaints, a potentially poor proxy for the intensity of criminal activity. This early literature anticipates performance monitoring or uses of secondary data, not program evaluation studies. So, measures must be easily and routinely observable.22

There were competing views about results. The increasing sophistication of cost accounting reflects a trend toward getting a clear grasp of government costs. Costs were growing at an alarming rate, often attributed to the ever-expanding scope of government. However, Upson (1922) examined the link between costs and expanded scope and concluded, “In truth, these new activities are not as costly as the expansion and ‘doing better’ old ones” (p. 318). He recommended what today would be called re-engineering government: “For example, lower police costs cannot come thru improved foot patrol methods, but by daring to question the efficiency of this whole activity” (Upson, 1922, p. 319). As Cramer’s (1929) study shows, the aim was to get better results while containing costs.23

Overall, the development is mixed. The concern over results neither advanced nor retreated. Some technical advances were achieved. However, some needed clarifications were not made, and concerns for pure cost suppression continued to compete with interest in achieving results.

ADVOCACY

At the onset of these activities, municipal research was promoted as advancing social causes by increasing government’s capacity to make better communities. The NYBMR, particularly Allen, did not shrink from demanding services, especially for the poor or dependent. During the 1910s, the social survey was a device for validating demands for increased governmental services from sewage systems to fire inspection to education. However, in the 1920s, this advocacy role began to fade. The new professional city managers took a stance of neutral competence rather than policy advocacy.

During this same period, measurement moved away from the qualitative social survey that conveyed the rich contextual nature of social problems to the quantitative study that provided a summary score or set of proportions that represented effectiveness and efficiency. The summary score did not necessarily provide the compelling story that was often communicated in the report of social surveys, so the need for corrective action could be more easily ignored. Also, summary scores might provide less specific guidance for improvement. A community might improve its score by addressing a number of smaller and less important problems rather than facing its more severe shortcomings directly.

Measurement also became increasingly the province of experts and academics. These groups were involved in measurement from the start. However, during the 1910s and early 1920s, the social survey was frequently conducted by citizen volunteers. In the middle of the 1920s, academics took a greater lead, attending to such matters as the appropriate weights of the various components of the survey. After Karl Pearson and R. A. Fisher made statistics into a general tool for study of social data in the early 20th century, the use of more sophisticated methods for empirical research began to spread through the academic community (Porter, 1986; Stigler, 1986, 1999). This new sophistication made empirical research the exclusive province of experts. Academics were not necessarily as interested in advocacy as were citizen activists.

Ridley, himself, was an expert and an academic. His Measuring Municipal Government (1927b) is a product of his dissertation at the School of Citizenship and Public Affairs at Syracuse University. In 1929, he became the executive director of the International City Manager’s Association (as it was then named) where he stayed until 1956. He taught measurement of government activities at the University of Chicago during the 1930s (Augier & March, 2001). He became a leading spokesman for governmental measurement, a practice that became more associated with management competence than policy advocacy.

COMMUNICATION WITH THE PUBLIC

The three original codirectors of the NYBMR (Allen, Cleveland, and Bruere) made repeated efforts to educate the public about government. Providing for informed public opinion was one of Allen’s (1907) five principles of efficient citizenship to help the public fulfill their citizenship duties, a view that Bruere (1912a) also articulated. Central to efficient citizenship was the collection and analysis of data about government that would then be made public through reports and news accounts. Cleveland was, perhaps, less committed to this view. Jonathan Kahn (1997) and Hindy Lauer Schachter (1997) argued that Allen was forced out of NYBMR largely because of his conflicts with Cleveland and patrons over public information.

William Allen went to such great trouble to communicate with the public that he came into conflict with the Rockefeller Foundation, the NYBMR’s chief patron (Schachter, 1997). Not only did the NYBMR frequently distribute material directly to the public (Kahn, 1997; Schachter, 1997), the research bureaus that sprang up in its likeness typically created public communication organs such as the Toledo City Journal or the Philadelphia Citizens’ Business. The most intense form of public communication may be the Budget Exhibit, a public event communicating governmental performance data to the public and conducted by the NYBMR in 1908 and 1909 and by New York City with NYBMR assistance in 1910 and 1911. The Budget Exhibit reached more than 1 million people in 1911 (Kahn, 1997; D. Williams, 2003). The later conflict over executive budgeting centered, in part, on the abandonment, or at least deemphasis, of the public information objective for analytic practices (Allen, 1917; Chase, 1917; Schachter, 1997).

Kahn (1997) argued that these sorts of activities reflected intellectual and political elitism; that is, the point of these efforts was to overcome the public’s inability or unwillingness to understand government. There is, probably, some truth to this assertion. The very idea of an organization for research into government rests on the idea that citizens require the aid of experts to mediate information such as budgets and government performance. This plan suggests that experts are elite consumers of governmental data who can determine what the public and public officials really need to know. This plan borders on paternalism. It also naïvely risks treating experts’ value judgments as if they were facts. Still, the plan is not unrealistic or necessarily anti-democratic. It is not unreasonable to suppose that voluminous and complex information about government requires some mediation for public consumption. The whole point of such activities as performance measurement (in 1907 or in 2003) would seem to be just that: Mediate difficult information to aid consumers. Any such plan implicitly assumes that the mediators are more expert in accessing the information; thus, it is somewhat elitist. But, there appears to be no way out of this result except to leave the public to its own devices in trying to understand difficult material. The attempt to mediate likely incorporates the mediators’ value judgments, as well. But this effect is an inescapable consequence of analysis, not evidence of deliberate undermining of democracy.

In 1923, Upson said, “Given to the press and the public, these operating statements would furnish an outside check, that should go a long way towards stimulating a degree of efficiency in public business at present unknown” (p. 122). This statement reflects two matters. First, as late as 1923, there was still considerable support for communicating performance data to the public. Second, at the same time, there was limited progress in producing the sort of performance data to be so communicated. In 1927, Ridley was not clear about whom he intended to receive the products of his measurement. In part, this silence may be because he was defining measures, not using them; but he continued into a career focused on serving government professionals and, only indirectly, the public. Although Mabel Walker produced an analysis comparing governmental contribution to quality of life in 160 cities, she published it in her dissertation (1930) and in The American City Magazine (1929), a trade journal for public officials, rather than in the popular press. However, in her dissertation she took note that Citizens’ Business, a publication of the Philadelphia Bureau of Municipal Research intended for the public, picked up and summarized her American City Magazine article (Bureau of Municipal Research of Philadelphia, 1929; M. Walker, 1930, p. 111). Cramer’s (1929) report appears to be more clearly targeted to the public.

In summary, in the early days of the NYBMR, reports of what and how the government was doing were clearly intended for the public. By 1930, this purpose had faded in significance and was partly replaced by an objective to communicate to government professionals.

GOVERNANCE

Ridley’s largest change from prior practices is not the inclusion of various services as the subjects of observation (all of the services he discussed were examined by Bruere and Allen), nor even particularly the addition of expert measurement. It is the narrowing of focus to government service alone. Before Ridley, municipal research included the measurement of governance, or the study of how well the government complies with the norm of democratic government (Allen, 1907; Beard, 1923; Bruere, 1912b; Upson, 1924b). In Efficient Democracy, Allen (1907, p. 273) suggested that the level of public participation in government decision making is an important measure of democracy. Bruere (1912b, pp. 376-400) took this advice and devoted 25 pages of his study of commission government to the examination of citizen control and cooperation. It was also during this period that the citizen survey movement developed; this movement might be viewed as a community and governmental self-study from the point of view of citizens (Aronovici, 1910, 1916). In his study of Tokyo, Beard (1923, pp. 137-161) devoted more than an eighth of the entire study to “The Spirit and Practice of Self-Government in Tokyo.” Even Upson’s (1924b, pp. 521-527) study of Cincinnati included a short section on “Citizen Influence on Government.” William Munro’s (1926) recommendations from the same period as Upson and Ridley are much more heavily oriented to governance and consider such matters as whether the city has home rule, the characteristics of the city charter, the number of elective offices, the size of council, and the terms of office. Although Ridley acknowledged Munro, whose focus is primarily governance, Ridley’s own focus was on government service.

In fact, neither Ridley nor M. Walker included any measure of governance. Although Cramer (1929) considered voter participation, he excluded this measure from his index. Ridley and Cramer were interested in how government agencies performed assigned tasks. Similarly, M. Walker’s question was: What is the social product of government activity? By asking narrower questions, they gave up something and gained something. They gave up the political component of the municipal research program. At least in principle, fire protection, or even the volume of library circulation, can be equally delivered by democratic or autocratic governments.

What they gained was focus. This focus permitted continued advances in measuring the particulars of government. One object of study, the adherence to democratic form, was no longer to be evaluated; however, another important object, satisfaction of public demand, remained important. Throughout the 20th century, Americans have sought increased public service but have resisted taxes. Focused performance measurement could serve as a technical tool to meet these objectives. It is as a focused technical tool that performance measurement became intimately associated with productivity improvement.

DECLINE IN ACTIVISM

Several of the changes over this period (reduced advocacy, moderation of interest in communicating with the public, and reduced interest in governance) collectively amount to a decline in activism among government reformers. Why did the activists of 1910 turn into the technicians of 1930? A full exploration of this question is beyond the scope of this article. However, we can consider some of the major theses that have been raised. Haber (1964) suggested that the combined effect of the First World War and the growing distrust of socialism killed activist progressivism. On this thesis, the decline of activism in public administration is symptomatic of the broader decline across the society. Stivers (2000) suggested that the choice to develop in the direction of technical competence was implicitly a defensive masculine form of social activism that contained in its origins the destiny of preference for technical neutrality. Schachter (1997) and Roberts (1994) suggested that the Rockefeller Foundation deliberately killed social activism in public administration. Schachter suggested that this suppression resulted from policy differences between Rockefeller’s trusted insiders and William H. Allen in particular. Roberts argued that Rockefeller’s choices were between supporting the development of a seemingly neutral public administration and providing no support at all because the public would not tolerate Rockefeller activism.

One should not overlook the fact that the political environment had changed. In the 1890s when the NML was first conceptualizing these developments, Jeffersonian government prevailed with minor inroads of Hamiltonian resurgence in the growing power of some mayors. By the 1920s, the Hamiltonians had won; mayors, governors, and city managers were powerful. After the passage of the Budget and Accounting Act of 1921, it was just a matter of time before the president would be the most powerful official of all. Activism arose in the transition period but declined when this period ended. In the transition period, performance measurement served activism as it provided an explanation as to how legislators were not giving up very much power to executives. When this explanation was no longer needed, performance measurement was not discarded; it was refocused to serve the more limited technical needs within the executive.

CONCLUSION

By 1930, performance measurement had a quarter century of practice behind it. It evolved from a more inclusive study of government to a narrower and more sophisticated study of government service. It was no longer closely linked to policy advocacy; however, its apparent objectivity and neutrality hid unexamined value judgments. It was less closely linked with the public but more closely linked with government management. Methodological sophistication brought new problems, such as difficulties with relative weights for indexes. Quantification sometimes became an end in itself. Critical concepts such as results remained unclarified.

By the end of this period, performance measurement was a practice, not simply an objective or recent innovation. M. Walker and Cramer’s comparative studies depended, in part, on routine collection of data that was only contemplated in 1900. Ridley had begun to bring intellectual rigor to outcomes measurement. The qualitative survey had become the quantitative index. Governmental cost accounting had abandoned the close link to budget accounting, thereby allowing more effective development.

This history also shows us that performance measurement does not refer to a particular empirical technique. Instead, it refers to the application of relevant techniques to the problem of observing government at work; after Ridley, that meant the delivery of government services. These empirical techniques included budget and cost accounting and collection of data on output of government activities and the social conditions that could reasonably be thought to depend to some degree on successful government service. Techniques, such as either the modern survey or the social survey of 1900, are relevant to the degree that they get to the objective observation of the method, efficiency, product, or outcome of government service. In 1906, the point of such observation was to inform the citizenry so that they could fulfill citizen duties. By 1930, the point had become much more management oriented: to assist the mayor, city manager, governor, or expert administrator to get good results out of limited resources.

NOTES

1. Normalization by population is a rudimentary cost accounting idea. It shows costs as they are experienced by the taxpayer but not as they are generated by activities. However, in 1900, this rudimentary approach to normalization reflected considerable advance over the not-too-distant past.

2. Tammany Hall was the most powerful political machine in New York City, and it was associated with corruption for a century after 1850.

3. The Philadelphia and Chicago studies had technical assistance from the New York Bureau of Municipal Research (NYBMR).

4. The Institute of Public Administration, which is the reorganized NYBMR, continues to conduct studies that could reasonably be thought to descend from the survey.

5. Ridley completed his text as his dissertation at Syracuse University. M. Walker completed hers as a dissertation at Johns Hopkins University. During the time Mabel Walker prepared her dissertation, Frank Goodnow, the renowned political scientist, was president of Johns Hopkins, which was also closely associated with the Institute for Government Research (Brookings Institution) that was headed by William F. Willoughby.

6. Governmental cost accounting, another technical area, also borrowed directly or indirectly from scientific management. Changes from earlier functional accounting may have been motivated, in part, by Cooke’s criticism.

7. The early Bureau of Municipal Research had three directors, which may be contrary to the principle of hierarchy later advocated by Luther Gulick. Other contrasts between the NYBMR’s agenda and scientific management originate in the early NYBMR period. The reader should not set too much store by the NYBMR’s apparent use of distributed leadership. It was, from its origins, associated with advocates of a strong executive in government, whether it followed this practice itself or not.

8. This survey is the settlement house social survey, not the survey of modern social science.

9. The random-sample survey of a population had not developed at this time; these interviews were, in part, the available alternative. However, the selected citizens were not intended to be representative; they were community leaders.

10. Commission government is a form of municipal government where a small number of legislators are elected and each legislator serves as a department head. This form of government is the antithesis of the strong executive government promoted by most government reformers of the era.

11. Publications 66 through 73 of the Wisconsin University Extension comprise the whole citizen survey instrument, which is 199 pages in length.

12. These government studies could also be classified as government research (the first category in this section). They are included in this category to emphasize the similarity of approach to information gathering.

13. Much of the work in surveys and scorecards was supported, at least in part, by the Russell Sage Foundation. Harriet Bartlett (1928), in an early account of the social survey, ascribed this relationship to the fortuitous coincidence of timing: “The Russell Sage Foundation, founded just as the Pittsburgh survey was getting under way, gave it considerable support” (p. 343).

14. Economic indexes are functionally similar to scorecards, but this research does not show a common ancestor.

15. The wording of these labels is his.

16. Aggregating cost per capita data in this manner is inaccurate, because it assigns too much weight to the preferred rank level for proportionately smaller service costs. It would be more accurate to aggregate costs across service categories, then compute per capita costs, and then rank. Cramer appeared to have used his method to provide as many points to costs as were available to service quality. Simple weighting adjustments would have been more reasonable.

17. For all the rest, he offered a scant 7 pages of text, which includes an introduction and conclusion, and 13 pages of appendixes that are abstracts of principal materials found in his bibliography.

18. By 1929, the issue of per capita costs was anachronistic. Good quality cost accounting methods were available, and cost per unit of service would have been more meaningful. Per capita costs are insensitive to the factors of service demand. Although they correctly reflect the taxpayers’ experience, they do not effectively measure governmental efficiency.

19. Her discussion in footnote 11 reveals concerns for normalizing the data but ignores the fact that her adjustments change the order of results when averaged over multiple categories.

20. In addition to the problems discussed here, no one seemed to take notice of items assigned zero weight, that is, that matters included in any scorecard, survey, or index represented only a nonrandom sample of matters that could be included.

21. Brownlow wrote the foreword to Clarence Ridley’s 1934 book on city management (Ridley & Nolting, 1934).

22. In addition, there is nothing in this literature that attends to possible data manipulation, as with watering garbage to increase apparent volume or minimizing the severity of recorded criminal complaints.

23. For a more extreme view, see “Are We Spending Too Much for Government?” (Vandegrift, 1927).
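The aggregation problem described in Note 16 can be illustrated with a minimal Python sketch. The three cities, their populations, and their costs are invented for illustration and are not taken from Cramer (1929). Summing per-category ranks is used here as a simplified stand-in for the general approach of scoring each service category separately before combining; it is not a reconstruction of Cramer's actual point scheme. The function names are likewise hypothetical.

```python
# Hypothetical figures, invented for illustration only (not from Cramer, 1929).
cities = {
    "A": {"population": 10_000,
          "costs": {"police": 100_000, "library": 5_000, "parks": 4_000}},
    "B": {"population": 20_000,
          "costs": {"police": 160_000, "library": 20_000, "parks": 12_000}},
    "C": {"population": 5_000,
          "costs": {"police": 45_000, "library": 4_000, "parks": 2_500}},
}


def rank_lowest_first(values):
    """Map each key to its rank (1 = lowest value); assumes no ties."""
    ordered = sorted(values, key=values.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


def order_by_summed_category_ranks(cities):
    """Rank per capita cost within each service, then add the ranks.

    Every service counts equally here, however small its share of spending.
    """
    services = list(next(iter(cities.values()))["costs"])
    rank_totals = {name: 0 for name in cities}
    for service in services:
        per_capita = {
            name: city["costs"][service] / city["population"]
            for name, city in cities.items()
        }
        for name, rank in rank_lowest_first(per_capita).items():
            rank_totals[name] += rank
    return sorted(rank_totals, key=rank_totals.get)


def order_by_total_per_capita(cities):
    """Aggregate costs across services, compute per capita cost, then rank."""
    per_capita = {
        name: sum(city["costs"].values()) / city["population"]
        for name, city in cities.items()
    }
    return sorted(per_capita, key=per_capita.get)


rank_based_order = order_by_summed_category_ranks(cities)
cost_based_order = order_by_total_per_capita(cities)
```

With these invented figures, the rank-based ordering places city A first and city B last, while the ordering by total per capita cost places city B first and city A last. City A wins the two small categories (library and parks) but spends the most on police, which dominates total cost; giving each category equal weight reverses the result.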

REFERENCES

Agnew, P. G. (1924). Results of standardization of supplies. Annals of the American Academy of Political and Social Science, 113, 269-271.

Allen, W. H. (1907). Efficient democracy. New York: Dodd, Mead, & Company.

Allen, W. H. (1917). The budget amendment of the Maryland Constitution. National Municipal Review, 7, 485-491.

Aronovici, C. (1910). Knowing one’s own town. Boston: American Unitarian Association.

Aronovici, C. (1916). The social survey. Philadelphia: The Harper Press.

A score card for West Virginia cities. (1923). National Municipal Review, 12(6), 336-338.

Augier, M., & March, J. G. (2001). Remembering Herbert A. Simon (1916-2001). Public Administration Review, 61(4), 396-402.

Ayres, L. P. (1920). An index number for state school systems. New York: Russell Sage.

Bachman, F. P. (1912). Attaining efficiency in city school systems. Annals of the American Academy of Political and Social Science, XLI, 158-175.

Barnum, C. L. (1924). Standardization of printed forms and stationery. Annals of the American Academy of Political and Social Science, 113, 286-291.

Bartlett, H. (1928). The social survey and the charity organization movement. American Journal of Sociology, 34(2), 330-346.

Beard, C. A. (1923). The administration and politics of Tokyo. New York: Macmillan Co.

Bills, M. A., & Dickenson, P. L. (1937). Measurement of staff output in clerical work. Public Administration, XV(3), 261-268.

Bouckaert, G. (1992). Public productivity in retrospective. In M. Holzer (Ed.), Public productivity handbook (pp. 15-46). New York: Marcel Dekker, Inc.

Bracy, W. L. (1924). Report on the survey of general civic conditions in Colorado cities and towns having a population of 2,000 or over. In D. C. Sowers & W. L. Bracy (Eds.), Proceedings of the second annual conference of the Colorado Municipal League (pp. 68-86). Boulder: Colorado Municipal League.

Bruere, H. (1912a). Efficiency in city government. Annals of the American Academy of Political and Social Science, XLI, 1-22.

Bruere, H. (1912b). The new city government. New York: D. Appleton & Co.

Bruere, H. (1915). Development of standards in municipal government. Annals of the American Academy of Political and Social Science, 61, 199-207.

Buck, A. E. (1924). Measuring the results of government. National Municipal Review, 13, 152-157.

Buck, A. E., & Watson, W. (1926). Cost accounting. In A. E. Buck (Ed.), Municipal finance (pp. 193-217). New York: The Macmillan Co.

Bureau of Labor Statistics. (1928). Park recreation areas in the United States. Washington, DC: Government Printing Office.

Bureau of Municipal Research. (1907). Making a municipal budget: Functional accounts and operative statistics for the Department of Health of Greater New York. New York: Bureau of Municipal Research.

Bureau of Municipal Research. (1916). The municipal research idea. Municipal Research, 71(1), 1-8.

Bureau of Municipal Research of Philadelphia. (1929, July 23). Municipal batting averages. Citizens’ Business, 896, 1-4.

Burks, J. D. (1912). Efficiency standards in municipal management. National Municipal Review, 1, 364-371.

Carey, W. D. (1946). Control and supervision of field officers. Public Administration Review, 6(1), 20-24.

Chase, H. S. (1917). The budget amendment of the Maryland Constitution. National Municipal Review, 7, 395-398.

Clow, F. R. (1896). Suggestions for the study of municipal finance. Quarterly Journal of Economics, 10(4), 455-466.

Committee on Administrative Practice. (1926). Appraisal form for city health work. New York: American Public Health Association.

Committee on Uniform Crime Records of the International Association of Chiefs of Police. (1930). Uniform crime reports for the United States and its possessions. New York: Author.

Commons, J. R. (1908). Standardization of housing investigations. Publications of the American Statistical Association, 11(84), 319-326.

Connell, W. H. (1912). Standardization of specifications for public works. Annals of the American Academy of Political and Social Science, 41, 127-137.

Converse, J. M. (1987). Survey research in the United States: Roots & emergence 1890-1960. Berkeley: University of California Press.

Cooke, M. L. (1910). Academic and industrial efficiency (Bulletin No. 5). New York: Carnegie Foundation.

Cooke, M. L. (1913). The spirit and social significance of scientific management. Journal of Political Economy, 21(6), 481-493.

Cooke, M. L. (1915). Scientific management of the public business. The American Political Science Review, 9(3), 488-495.

Cooke, M. L. (1918). Our cities awake. New York: Doubleday, Page, & Co.

Cooke, M. L. (1924). The influence of scientific management upon government: Federal, state and municipal. Bulletin of the Taylor Society, 9(1), 31-38.

Cottrell, E. A. (1925). Round table on municipal administration development of a method of rating the relative efficiency of cities. The American Political Science Review, 19(1), 150-155.

Cramer, E. H. (1929). A survey of the general civic conditions of Colorado cities having a population of 2,000 or more. Boulder, CO: Colorado Municipal League.

Cummings, J. (1913). The permanent Census Bureau: A decade of work. Publications of the American Statistical Association, 13(104), 605-638.

Curtis, M. (1937). Measurement of staff output in clerical work. Public Administration, XV(3), 255-260.

Dahlberg, J. S. (1966). The New York Bureau of Municipal Research: Pioneer in government administration. New York: New York University Press.

Deacon, W. J. V. (1926). The rating of Michigan cities. Michigan Public Health, 14(8), 175-182.

Dunaway, J. A. (1916). Standardization and inspection, Part 1. The American Political Science Review, 10(2), 315-319.

Ehrenhalt, A. (1994). ASSESSMENTS, performance budgeting, thy name is . . . ; Old ideas cloaked in the trappings of science are still old ideas. Governing Magazine, 8(2), 9-10.

Emmons, F. E. (1926). City school attendance service. New York: Bureau of Publications, Teachers College, Columbia University.

Fairlie, J. A. (1899). Comparative municipal statistics. Quarterly Journal of Economics, 13(3), 343-353.

Fairlie, J. A. (1901). Municipal accounts and statistics in continental Europe. In C. R. Woodruff (Ed.), Rochester Conference for Good City Government and the seventh annual meeting of the National Municipal League (pp. 282-301). Rochester, NY: National Municipal League.

Fairlie, J. A. (Ed.). (1908). Comparative municipal statistics. In Essays in Municipal Administration (pp. 275-285). New York: The Macmillan Company.

Federal Council of Citizenship Training. (1924). Community scorecard. Washington, DC: Government Printing Office.

Fox, K. (1977). Better city government: Innovation in American urban politics, 1850-1937. Philadelphia: Temple University Press.

Garner, S. P. (1954). Evolution of cost accounting to 1925. Montgomery: University of Alabama Press.

Gill, N. N. (1944). Municipal research bureaus. Washington, DC: American Council on Public Affairs.

Goodnow, F. J. (1912). The limit of budgetary control. Proceedings of the American Political Science Association, 9, 68-77.

Government Research Association. (1933). An annotated roster of the Governmental Research Association. Chicago: Author.

Gulick, L. (1928). The National Institute of Public Administration, a progress report. New York: The National Institute of Public Administration.

Gulick, L. (1929). Wanted: A measuring stick for school systems. National Municipal Review, 18, 3-5.

Gulick, L. (1981). Notes on a theory of organization. In F. Mosher (Ed.), Basic literature of public administration 1787-1950 (pp. 149-173). New York: Holmes & Meier.

Haber, S. (1964). Efficiency and uplift: Scientific management in the progressive era 1890-1920. Chicago: The University of Chicago Press.

Hammond, M. B. (1911). Recent efforts to advance freight rates. The American Economic Review, 1, 766-789.

Hanger, G. W. W. (1901). Present conditions of municipal statistics in the United States. In C. R. Woodruff (Ed.), Rochester Conference for Good City Government and the seventh annual meeting of the National Municipal League (pp. 264-277). Rochester, NY: National Municipal League.

Hansen, I. M., & Wheeler, H. L. (1927). Statistics of American libraries. Library Journal, 52, 511-525.

Hartwell, E. M. (1901). Report of the Committee on Uniform Municipal Accounting. In C. R. Woodruff (Ed.), Rochester Conference for Good City Government and the seventh annual meeting of the National Municipal League (pp. 248-263). Rochester, NY: National Municipal League.

How one city scored itself. (1923). National Municipal Review, 12(4), 163-164.

Hudson, R. M. (1926, October 4). Making a town livable, a suggestion for greater stability. National Real Estate Journal, pp. 17-24.

Interstate Commerce Commission. (1911). Evidence taken by the Interstate Commerce Commission in the matter of proposed advances in freight rates by carriers. Docket No. 3400 (Eastern Case); Docket No. 3500 (Western Case). U.S. Interstate Commerce Commission. Washington, DC: Government Printing Office.

Kahn, J. (1997). Budgeting democracy: State building and citizenship in America, 1890-1928. Ithaca, NY: Cornell University Press.

Klein, O. J. (1912). Securing efficiency through a standard testing laboratory. Annals of the American Academy of Political and Social Science, 41, 93-102.

Lester, H. (1924). Report upon classification of crimes. Journal of the American Institute of Criminal Law and Criminology, 14, 593-896.

Mark, M. L. (1916). Report of social survey of Portsmouth, Ohio. Columbus: Ohio Institute for Public Efficiency.

Martin, H. M., Davis, S. D., & Keppel, N. M. (1926). Statistics of city libraries, 1926. Library Journal, 51, 1115-1119.

Maxey, C. C. (1919). A little history of pork. National Municipal Review, 8, 691-705.

Measuring government efficiency. (1924). National Municipal Review, 13(9), 534-535.

Metcalfe, H. (1885). The cost of manufactures and the administration of workshops, public and private. New York: John Wiley.

Meyer, E. C. (1910). The National Census Bureau and our cities. Proceedings of the American Political Science Association, 7, 126-137.

Munro, W. B. (1926). The government of American cities. New York: The Macmillan Company.

Neustadt, R. E. (1963). Approaches to staffing the presidency: Notes on FDR and JFK. The American Political Science Review, 57, 855-864.

Ogburn, W. F. (1917). A statistical study of American cities by students of Reed College. Portland, OR: Reed College Record.

Palmer, G. T. (1926). The scoring of city health work. Michigan Public Health, 14(4), 75-83.

Palmer, G. T., Platt, P. S., Walker, W. F., Nicoll, A. J., & Jablonower, A. (1925). A health survey of 86 cities. New York: American Child Health Association.

Petroski, H. (1992). The evolution of useful things. New York: Knopf.

Porter, T. M. (1986). The rise of statistical thinking 1820-1900. Princeton, NJ: Princeton University Press.

Previts, G. J., & Merino, B. D. (1979). A history of accounting in America: An historical interpretation of the cultural significance of accounting. New York: John Wiley.

Pultz, J. L. (1912). Economy and efficiency in the Department of Water Supply, Gas and Electricity, New York City. Annals of the American Academy of Political and Social Science, XLI, 78-85.

Quigley, J. M. (1925). Annual report of the Police Bureau. Rochester, NY: Department of Public Safety.

Richards, E. G. (1915). The experience grading and rating schedule: Designed to be a United States standard for measuring fire insurance costs based upon combined experience averages. New York: National Board of Fire Underwriters.

Ridley, C. E. (1927a). Means of measuring municipal government. Unpublished doctoral dissertation, Syracuse University, Syracuse, New York.

Ridley, C. E. (1927b). Measuring municipal government. New York: Municipal Administration Service & School of Citizenship & Public Affairs, Syracuse University.

Ridley, C. E., & Nolting, O. F. (1933). How cities can cut costs. Chicago: The International City Managers’ Association.

Ridley, C. E., & Nolting, O. F. (1934). The city manager profession. Chicago: University of Chicago Press.

Ridley, C. E., & Simon, H. (1938). Measuring municipal activities. Chicago: The International City Managers’ Association.

Ridley, C. E., & Simon, H. (1943). Measuring municipal activities. Chicago: The International City Managers’ Association.

Ridley, C. E., & Simon, H. (1948a). Measuring municipal activities. Chicago: The International City Managers’ Association.

Ridley, C. E., & Simon, H. (1948b). Specifications for the annual municipal report. Chicago: The International City Managers’ Association.

Roberts, A. (1994). Demonstrating neutrality: The Rockefeller philanthropies and the evolution of public administration, 1927-1936. Public Administration Review, 54(3), 221-238.

Rosenberg, H. H. (1948). Can work measurement be applied to the personnel office? Public Administration Review, 8(1), 41-48.

Schachter, H. L. (1997). Reinventing government or reinventing ourselves. Albany: State University of New York Press.

Schneider, F., Jr. (1916). Relative values in public health work. New York: Russell Sage.

Scott, W. G. (1924). Results of the Pennsylvania plan for standardization and purchasing supplies. Annals of the American Academy of Political and Social Science, 113, 298-305.

Sklar, K. K. (1991). Hull-House maps and papers: Social science as women’s work in the 1890s. In M. Bulmer, K. Bales, & K. K. Sklar (Eds.), The social survey in historical perspective, 1880-1940 (pp. 115-129). Cambridge, UK: Cambridge University Press.

Smith, B. (1929). A guide for preparing annual police reports. New York: Committee on Uniform Crime Records, International Association of Chiefs of Police.

Stigler, S. M. (1986). The history of statistics: The measurement of uncertainty before 1900. Cambridge, MA: The Belknap Press of Harvard University Press.

Stigler, S. M. (1999). Statistics on the table: The history of statistical concepts and methods.Cambridge, MA: Harvard University Press.

Stivers, C. (2000). Bureau men, settlement women: Constructing public administration in theprogressive era. Lawrence: University Press of Kansas.

Strayer, G. D., Engelhardt, N. L., & Elsbree, W. S. (1927). Standards for the administrationbuilding of a school system. New York: Bureau of Publications Teachers College, Colum-bia University.

Taussig, B. J. (1912). Results obtainable through reorganization of accounting methods.Annals of the American Academy of Political and Social Science, XLI, 57-63.

Taylor, F. (1947/1903). Shop management. New York: Harper & Brothers.

Treleven, J. E. (1912). The Milwaukee Bureau of Economy and Efficiency. Annals of the American Academy of Political and Social Science, XLI, 270-279.

Upson, L. D. (1922). Increasing activities and increasing costs. National Municipal Review, 11(10), 317-320.

Upson, L. D. (1923). The other side of the budget. National Municipal Review, 12, 119-122.

Upson, L. D. (1924a). Round table V. Political statistics. The American Political Science Review, 18(1), 146-148.

Upson, L. D. (1924b). The Government of Cincinnati and Hamilton County: A report to the Republican Executive and Advisory Committee of Hamilton County. Cincinnati, OH: City Survey Committee.

Upson, L. D. (1926). Practice of municipal administration. New York: Century Co.

U.S. Public Health Service. (1926). Municipal Health Department practice for the year 1923 based upon surveys of the 100 largest cities in the United States. Washington, DC: Government Printing Office.

Vandegrift, R. A. (1927). Are we spending too much for government? National Municipal Review, 16, 526-535.

Waldo, D. (1948). The administrative state. New York: The Ronald Press.

Walker, H. (1926a). Grading of fire departments to determine fire insurance rates. Minnesota Municipalities, 11, 447-451.

Walker, H. (1926b). Problems of determining fire insurance rates. Minnesota Municipalities, 11, 605-607.

Walker, H. (1926c). The classification of cities and villages for determining fire insurance rates. Minnesota Municipalities, 11, 312-320.

Walker, H. (1926d). The grading of water supplies to determine fire insurance rates. Minnesota Municipalities, 11, 406-413.

Walker, M. L. (1929). Rating cities according to the services which their citizens are getting. The American City, 41, 130-134.

Walker, M. L. (1930). Municipal expenditures. Baltimore: Johns Hopkins University Press.

Welton, B. F. (1912). The problem of securing efficiency in municipal labor. Annals of the American Academy of Political and Social Science, XLI, 103-114.

White, L. D. (1948). The Federalists; A study in administrative history 1789-1801. New York: Free Press.

White, L. D. (1951). The Jeffersonians; A study in administrative history 1801-1829. New York: Free Press.

White, L. D. (1954). The Jacksonians; A study in administrative history 1829-1861. New York: Free Press.

White, L. D. (1958). The Republican era; A study in administrative history 1869-1901. New York: Free Press.

Wilcox, W. F. (1896). Methods of determining the economic productivity of municipalities. American Journal of Sociology, 2(3), 378-391.

Williams, D. W. (2002). Before performance measurement. Administrative Theory and Praxis, 24(6), 457-486.

Williams, D. W. (2003). Measuring government in the early twentieth century. Public Administration Review, 63(6), 643-659.

Williams, H. (1927). The scoring of fifty-seven New York State cities. American Journal of Public Health, 17, 584-587.

Willoughby, W. F. (1910). The correlation of financial and physical statistics of cities. In C. R. Woodruff (Ed.), Buffalo Conference of the National Municipal League (pp. 203-213). Buffalo, NY: National Municipal League.

Wisconsin Conference of Social Work. (1927). Citizen’s survey measurement standards for community activities town planning and zoning. Madison: University of Wisconsin.

Woodruff, C. R. (Ed.). (1910). The new municipal idea. In Buffalo Conference of the National Municipal League (pp. 22-102). Buffalo, NY: National Municipal League.

Daniel W. Williams has taught at the Baruch School of Public Affairs since 1995. Before that, he was the budget director for the Virginia Department of Medical Assistance Services. Other articles include “Reinventing the Proverbs of Government,” Public Administration Review (2000); “Before Performance Measurement,” Administrative Theory and Praxis (2002); and “Measuring Government in the Early Twentieth Century,” Public Administration Review (2003).
