

REVIEW ARTICLE

SPATIAL INTERACTION DATA

ABSTRACT. The lack of good flow data is a handicap to spatial interaction research, yet many published works provide little evaluation of such data. Good quality flow data should provide spatial coverage at a large scale with small sampling and other error components. Few generally available data series for interregional commodity flows, interregional population migration, and intercity person movement in the United States meet these basic requirements. The data sets of the Bureau of the Census, Department of Transportation, Social Security Administration, and other government agencies do not provide a sound empirical foundation for spatial interaction analysis at this time. KEY WORDS: Commodity Flows, Migration, Nonmigratory movements of people, Origin-destination data, Population migration, Spatial interaction, United States.

Much has been written about the concepts, models, and tools of spatial interaction analysis, yet an understanding of the flows of many phenomena rests on a very weak empirical base. Empirical information for the study of interregional or interurban flows is neither great in quantity nor generally sound in quality.1 Teachers and scholars in demography, economics, geography, planning, regional science, and transportation are dissatisfied with available flow data.2 Concern has been expressed in state and federal governments, and in private industry and commerce, over the paucity and limitations of data on the movement of commodities and people.3

Accepted for publication 13 June 1974.

ANNALS OF THE ASSOCIATION OF AMERICAN GEOGRAPHERS © 1974 by the Association of American Geographers.

Vol. 64, No. 4, December 1974 Printed in U.S.A.

1 The paper has been restricted to interregional commodity movements, population migration, and nonmigratory movements of people within the United States, although I surveyed other types of flows and international movements.

2 Most of the standard pedagogic works in these fields devote little space to spatial interaction, and on the research plane there has been more discussion of concepts, models, and application of quantitative tools than of the more mundane topic of data. Some publications in which flow data are of central interest include B. J. L. Berry et al., Essays on Commodity Flows and the Spatial Structure of the Indian Economy, Research Paper No. 111 (Chicago: University of Chicago, Department of Geography, 1966); J. N. H. Britton, Regional Analysis and Economic Geography (London: G. Bell and Sons, 1967); P. O’Sullivan, Transport Networks and the Irish Economy, Geographical Papers No. 4 (London: London School of Economics, 1969); A. Pred, “Toward a Typology of Manufacturing Flows,” Geographical Review, Vol. 54 (1964), pp. 65-84; W. E. Reed, Areal Interaction in India, Research Paper No. 110 (Chicago: University of Chicago, Department of Geography, 1967); P. J. Schwind, “Spatial Preferences of Migrants for Regions: The Example of Maine,” Proceedings, Association of American Geographers, Vol. 3 (1971), pp. 150-56; R. N. Taaffe, “Interregional Passenger Movement in the U.S.S.R.,” East Lakes Geographer, Vol. 3 (1967), pp. 47-79; and A. V. Williams and W. Zelinsky, “On Some Patterns in International Tourist Flows,” Economic Geography, Vol. 46 (1970), pp. 549-67.

3 As reflected in H. O. Whitten, ed., Transport Flow Data, Proceedings of the National Transportation Flow Statistics Forum, June 27-29, 1968 (Washington, D. C.: Transportation and Logistics Institute, School of Business Administration, The American University, 1968); and in several government documents of the 1960s: Round-Table Discussion on Federal Transportation Statistics (Washington, D. C.: Federal Statistics Users Conference, September 1961); Conference on Transportation Research (Woods Hole, Massachusetts: National Academy of Science-National Research Council, August 1960); House Report No. 1700, House Committee on Post Office and Civil Service, Subcommittee on Census and Government Statistics, 87th Congress, 2nd Session, Improving Federal Transportation Statistics, May 17, 1962; House Report 89-17, House Committee on Interstate and Foreign Commerce, Subcommittee on Transportation and Aeronautics, 89th Congress, 1st Session, Commerce Department Transportation Research, HR 5863, June 30, 1965; culminating in a 1969 proposal by the then newly created Department of Transportation for a transportation information program: Department of Transportation, Transportation Information-A Report to the Committee on Appropriations, U. S. House of Representatives, from the Secretary of Transportation (Washington, D. C.: U. S. Department of Transportation, 1969); this report also contains a comprehensive survey of data. Other


synopses and discussion of data are: F. Hendrix, “Federal Transport Statistics: An Analysis,” Transportation Journal, Vol. 5 (1965), pp. 5-15; D. B. Holleb, Social and Economic Information for Urban Planning (Chicago: Center for Urban Studies, University of Chicago, 1969), Vol. 1, pp. 94-98, and Vol. 2, pp. 65-100; and M. L. Rose, “Some Problems and Prospects in Collecting Data on Travel Demand,” in Mathematica, Studies in Travel Demand (Princeton, New Jersey: Mathematica, Inc., 1966-69), Vol. 1, pp. 134-56.

Data requirements pertain to two components. First, it is necessary to measure flow magnitude, involving decisions about units of measurement, and units of observation, such as family, household, or individual. Flow magnitude may be single-place specific (the volume of movement generated by or terminated at a particular location) or dyad-specific (the flow volume between a pair of locations). Dyad-specific information must be available for spatial interaction studies. In the second place, needed locational information includes the selection of a scale, the delimitation of areal units, and the choice of areal statistical units.4 Such aspects are all usually part of the research design for a specific study, but areal units and measurement of magnitudes will affect the conclusions if general purpose data are used.

It may appear superfluous to state that data must be appropriate for the task at hand, but it appears that whatever data were readily available have been used without sufficient controls, or with no commentary on their quality.5

Good quality data have the following characteristics: 1) the universe from which the sample is drawn should be known, and details should be provided for the user; 2) the sampling design and sampling variability statistics should be known, and the sampling variability should not be large; 3) it should be possible to separate sampling error from error arising from other causes, and other error should be known and small; 4) definitions of terms and concepts should be clear and explained in sufficient detail; and 5) definitions and items of data ought to be compatible and not subject to variations in time and space when data from different sources are used.6

4 Difficulties arise over the choice of areal statistical units in an origin-destination matrix. Even if information is collected for small units, such as street addresses or railroad stations, it is usually made available only for larger units. Some form of compromise is thus imposed on the data user who in most cases is unable to select the best unit of analysis. An administrative bias exists in the areal statistical units, because different agencies have their own geocoding schemes. A recent review of many schemes of public and private organizations is P. A. Werner, “National Geocoding,” Annals, Association of American Geographers, Vol. 64 (1974), pp. 310-17. The question of efficient spatial units has been asked in only a few instances: B. J. L. Berry, Metropolitan Area Definition: a Re-evaluation of Concept and Statistical Practice, Working Paper No. 28 (Washington, D. C.: U. S. Bureau of the Census, 1968); J. Wolpert, “The Basis for Stability of Interregional Transactions,” Geographical Analysis, Vol. 1 (1969), pp. 152-80; and P. J. Schwind, Migration and Regional Development in the United States, 1950-1960, Research Paper No. 133 (Chicago: University of Chicago, Department of Geography, 1971). Some general observations on this topic are provided by: T. Hermansen, “Information Systems for Regional Development Planning: Issues and Problems,” in T. Hagerstrand and A. R. Kuklinski, eds., Information Systems for Regional Development-a Seminar, Lund Studies in Geography, Series B, No. 37 (Lund: The Royal University of Lund, Sweden, Department of Geography, 1971), pp. 1-37.

Functional regions are preferable to formal or arbitrary regions for spatial interaction studies. The determination of the functional regions presupposes that good flow data are available. Only a small number of efforts have been directed to this need: E. J. Taaffe, “The Urban Hierarchy: an Air Passenger Definition,” Economic Geography, Vol. 38 (1962), pp. 1-14; B. J. L. Berry, “Interdependency of Spatial Structure and Spatial Behavior: a General Field Theory Formulation,” Papers of the Regional Science Association, Vol. 21 (1968), pp. 205-27; J. R. Borchert, “America’s Changing Metropolitan Regions,” Annals, Association of American Geographers, Vol. 62 (1972), pp. 352-73; L. A. Brown, J. Odland, and R. G. Golledge, “Migration, Functional Distance, and the Urban Hierarchy,” Economic Geography, Vol. 46 (1970), pp. 472-85; L. A. Brown and J. Holmes, “The Delimitation of Functional Regions, Nodal Regions, and Hierarchies by Functional Distance Approaches,” Journal of Regional Science, Vol. 11 (1971), pp. 57-72; and Schwind, op. cit., footnote 4.

5 Few spatial interaction studies provide any review or evaluation of data problems. Furthermore, recent pedagogical articles on spatial interaction make practically no reference to data needs or problems: D. K. Fleming, “Spatial Interaction,” in P. Bacon, ed., Focus on Geography (Washington, D. C.: National Council for Social Studies, 1970), pp. 147-72; and P. Gersmehl, “Spatial Interaction,” Journal of Geography, Vol. 69 (1970), pp. 552-80. Overviews of transportation geography have no comments on data: P. O. Muller, “Recent Developments in the Spatial Analysis of Transportation,” Pennsylvania Geographer, Vol. 9, No. 4 (1971), pp. 14-17; and J. O. Wheeler, “An Overview of Research in Transportation Geography,” East Lakes Geographer, Vol. 7

(1971), pp. 3-12.

The total set of flows is the universe of a sample for spatial interaction data. Bias in the data has to be considered if the data are derived from other than the universe of flows. One must know if the sampling design incorporates a spatial element. Good quality spatial interaction data are based on a uniform and reliable coverage of all areal units, and are accompanied by information on sampling and other errors. One must have data for the magnitude of flows within the areal statistical units if the total volume of spatial interaction has to be determined. Good data will also reveal the true origins and destinations of the movement.

Unfortunately the desire for detailed spatial coverage generally increases the cost of data collection and preparation, and may also infringe on the confidentiality of information.7 The common practice of using an areally clustered sample to obtain data for a national universe in order to cut data collection costs prevents good spatial coverage. Detailed, reliable spatial coverage is usually obtainable only at quite high cost.

INTERREGIONAL COMMODITY FLOWS

The specific data requirements for the study of interregional commodity flows depend upon the objectives of the user, but certain general needs may be distinguished. It is useful to think in terms of commodity characteristics, transportation characteristics, and spatial characteristics. I will give attention only to the last of these. The task is to obtain information on different commodities for all cells of a carefully designed origin-destination matrix so that regional economic dependency as reflected in movements of commodities can be analyzed.

6 Such matters in the context of transportation data are discussed by W. W. Deming, “Pitfalls in Statistical Sampling,” in Whitten, footnote 3, pp. 61-69; A. C. Rosander, “Obtaining Acceptable Quality Data from Carload Waybill and Other Samples,” Highway Research Record, Vol. 82 (1965), pp. 114-20; and Rose, op. cit., footnote 3.

7 The data requirements for spatial interaction analysis greatly exceed those of areal differentiation studies for the same number of items. For example, a matrix of county-to-county population migration for one group of people contains nine million cells, a complete tabulation of the flows among the 50,000 rail stations in the country requires a matrix of 2,500 million cells, and a dyadic matrix for fifty states for eight commodity groups has 20,000 cells of information. Many compromises are necessary to balance the different attitudes and needs of data provider and user. Much of the skill involved in obtaining good quality data at reasonable cost lies in establishing sound sampling practices on the part of the supplier; it is hopefully an accepted principle of the user that the quantity of data is less important than the quality.
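The matrix sizes quoted in footnote 7, and the earlier distinction between single-place and dyad-specific magnitudes, can be illustrated with a short sketch. It is a minimal illustration only: the three-region flow table is hypothetical, the function names are invented for this example, and the county figure assumes roughly 3,000 counties, which is what a nine-million-cell matrix implies. Fifty states and eight commodity groups give 50 × 50 × 8 = 20,000 cells.

```python
def matrix_cells(n_areas, n_groups=1):
    """Cells in a full origin-destination array: every origin paired with
    every destination (intra-area cells included), once per group."""
    return n_areas * n_areas * n_groups


# Sizes discussed in footnote 7 (the county count is an assumed round figure).
COUNTY_CELLS = matrix_cells(3_000)            # county-to-county migration
STATION_CELLS = matrix_cells(50_000)          # rail station to rail station
STATE_COMMODITY_CELLS = matrix_cells(50, 8)   # fifty states, eight groups

# Hypothetical flow table: rows are origins, columns are destinations.
REGIONS = ["A", "B", "C"]
FLOWS = [
    [40, 10, 5],
    [8, 60, 12],
    [3, 9, 30],
]


def originated(flows, i):
    """Single-place magnitude: total volume generated by area i."""
    return sum(flows[i])


def terminated(flows, j):
    """Single-place magnitude: total volume terminated at area j."""
    return sum(row[j] for row in flows)


def intra_area(flows):
    """Volume that stays within areal units (the diagonal cells)."""
    return sum(flows[i][i] for i in range(len(flows)))


def inter_area(flows):
    """Sum of the dyad-specific flows between different areas."""
    return sum(
        value
        for i, row in enumerate(flows)
        for j, value in enumerate(row)
        if i != j
    )


if __name__ == "__main__":
    print("cells:", COUNTY_CELLS, STATION_CELLS, STATE_COMMODITY_CELLS)
    for i, name in enumerate(REGIONS):
        print(name, "originated", originated(FLOWS, i),
              "terminated", terminated(FLOWS, i))
    print("intra-area", intra_area(FLOWS), "inter-area", inter_area(FLOWS))
```

Row and column totals alone cannot recover the individual cells, which is why dyad-specific information is the binding requirement; and without the diagonal cells the total volume of interaction cannot be determined.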

Good data are lacking, yet surprisingly limited use has been made of available statistics. Unlike Smith, I feel that empirical poverty rather than conceptual poverty is a keynote of commodity flow studies.8 Concepts, tools, and models have been developed beyond the capabilities of the data, although there have been several empirical studies of commodity movements in the United States, and commodity flow data have been used to illustrate technical matters, illustrate concepts, and test hypotheses.9

8 R. H. T. Smith, “Concepts and Methods in Commodity Flow Analysis,” Economic Geography, Vol. 46 (1970), pp. 404-16.

9 Models for predicting intercity commodity movements are discussed by J. R. Meyer and M. R. Straszheim, “Forecasting Demands for Intercity Freight Transport,” in J. R. Meyer, ed., Techniques of Transport Planning, Vol. 1: Pricing and Project Evaluation (Washington, D. C.: Brookings Institution, 1971, 2 Vols.), pp. 165-84, yet there is little mention of the means of calibrating these models. Smith, op. cit., footnote 8, omits discussion of data in his overview of commodity flow studies. Some empirical studies using commodity flow data are: P. A. Groves, Towards a Typology of Intermetropolitan Manufacturing Location, Occasional Papers in Geography, No. 16 (Hull: University of Hull, 1971); D. J. Patton, “The Traffic Pattern on American Inland Waterways,” Economic Geography, Vol. 32 (1956), pp. 29-37; E. D. Perle, The Demand for Transportation: Region and Commodity Studies in the United States, Research Paper No. 45 (Chicago: University of Chicago, Department of Geography, 1964); R. L. Pfister, “The Terms of Trade as a Tool for Regional Analysis,” Journal of Regional Science, Vol. 3 (1961), pp. 57-66; S. Spiegelglas, “Some Aspects of State-to-State Commodity Flows in the United States,” Journal of Regional Science, Vol. 2 (1960), pp. 71-80; E. L. Ullman, American Commodity Flow (Seattle: University of Washington Press, 1957); and W. H. Wallace, “Freight Traffic Functions of Anglo-American Railroads,” Annals, Association of American Geographers, Vol. 53 (1963), pp. 312-31. Papers using commodity flow data to illustrate quantitative techniques are: W. R. Black, “Toward a Factorial Ecology of Flows,” Economic Geography, Vol. 49 (1973), pp. 59-67; K. R. Cox, “The Application of Linear Programming to Geographic Problems,” Tijdschrift voor Economische en Sociale Geografie, Vol. 56 (1965), pp. 228-36; L. N. Moses, “The Stability of Interregional Trading Patterns and Input-Output Analysis,” American Economic Review, Vol. 45 (1955), pp. 803-26; W. S. Peters, “Measures of Regional Interchange,” Papers of the Regional Science Association, Vol. 11 (1963), pp. 285-94; R. H. T. Smith, “Toward a Measure of Complementarity,” Economic Geography, Vol. 40 (1964), pp. 1-8; and


R. Vining, “Delimitation of Economic Areas: Statistical Conceptions in the Study of the Spatial Structure of an Economic System,” Journal of the American Statistical Association, Vol. 48 (1953), pp. 44-64. A few studies with a conceptual or model building thrust are: W. R. Black, “The Utility of the Gravity Model and Estimates of its Parameters in Commodity Flow Studies,” Proceedings, Association of American Geographers, Vol. 3 (1971), pp. 28-32; W. R. Black, “Interregional Commodity Flows: Some Experiments with the Gravity Model,” Journal of Regional Science, Vol. 12 (1972), pp. 107-18; A. L. Olson, “A Method for Estimating Regional Redistributions of Economic Activity,” Papers of the Regional Science Association, Vol. 28 (1972), pp. 181-87; Pred, op. cit., footnote 2; R. Riefler and C. M. Tiebout, “Interregional Input-Output: An Empirical California-Washington Model,” Journal of Regional Science, Vol. 10 (1970), pp. 135-52; and C. S. Shaw, Interregional Trade and Production Scheduling, Vol. 1 of SEPS Reports, Central Economic and Demographic Study for New York State (Washington, D. C.: Center for Economic Projections, Economic Programming Center, National Planning Association, 1970).

Census of Transportation

An absence of data is not entirely the cause of this empirical poverty. Origin-destination data series are prepared by several departments of the United States government. The Bureau of the Census regularly publishes data on the movement of manufactured goods in the census of transportation. Most of the material for 1963, 1967, and 1972 is for nonspatial items.10 Flow data are published for shipper groups (combinations of industry types) and for eight commodities in nine census divisions, twenty states, and twenty-five major manufacturing areas known as production areas.11 There are flow matrices for tons and ton-miles for numerous commodity classes for states and production areas. The flow figures are given as percentage distributions for up to five-digit commodity categories where the sampling error is less than fifty percent for tons originated, and where there is no likelihood of disclosure of confidential information.

Response and processing errors are not great in this data series, but sampling error averages about thirty percent for many areas and commodities (Table 1).12 Error factors are provided for production areas by commodity for the total amount of traffic originated for the origin-destination tabulations. The sampling errors are not given for flows between places. Many of the flow values probably have very high sampling errors; single-place-specific error factors are quite high for many commodities even at the three-digit level. The sample design does not include any geographical stratification.13 There is no guarantee that every part of the country is covered adequately, and areas with small amounts of traffic will have higher than average sampling errors. It is not possible to obtain a simple relationship between sample and universe because the sample is based on manufacturing firms rather than on the universe of flows.14

10 Bureau of the Census, Census of Transportation, 1963 (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1965); Bureau of the Census, Census of Transportation, 1967, Vol. III, Commodity Transportation Survey (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1970); and Bureau of the Census, Census of Transportation, 1972, Vol. III, Commodity Transportation Survey (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, forthcoming).

11 The production area is a single SMSA or group of SMSAs. Census of Transportation, 1967, op. cit., footnote 10, has a list of the areas for 1967. The areas were selected mostly on the basis of their significance as manufacturing centers, as revealed by 1958 census data on the number of firms, manufacturing employment, and value added, but Denver and Seattle were chosen on grounds of geographical location.

12 A complete list of sampling errors for three-digit commodity categories is given in Census of Transportation, 1967, op. cit., footnote 10, Table A.

13 This point is raised by J. P. Crecine et al., “The Census of Transportation: An Evaluation,” Papers, 7th Annual Meeting, Transportation Research Forum, 1966, pp. 97-106, which also provides a critique of the whole sampling procedure. For details of the sample design see D. E. Church, Sample Design, Commodity Transportation Survey, 1967 Census of Transportation (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1968). The sample, briefly, consists of a systematic selection of bills of lading from an industrially stratified sample of manufacturing plants. The rate of sampling is twice as high for production areas as for other areas.

14 Crecine et al., op. cit., footnote 13.

15 The commodity omissions reflect the policy of the Census Bureau to obtain and publish transportation data which are not already available from other sources.

These data series have two kinds of incomplete coverage. Certain types of commodity flows are completely omitted.15 All flows from nonmanufacturing activities, flows by pipeline, and commodity movements from ordnance, fluid milk, bakery products, manufactured ice, primary forest products, printing and publishing


plants, and from plants with fewer than twenty employees are excluded.16 Although small plants account for about two-thirds of all manufacturing plants in the nation, the error arising from their exclusion is not great, because they account for only five percent of employment or value added, and ship mainly to local areas, but a large volume of flow is omitted by excluding pipeline traffic and goods of nonmanufacturing industry.

Incomplete geographical coverage is the major drawback to the published statistics of this data source. Some short distance shipments are omitted, but the distance range is not clear, nor is the volume of shipments excluded known.17 The error involved in estimating intra-area flows is consequently indeterminate, but it is probably much larger than for interarea flows. There is also some error in measuring distance of shipment because the generally used straight-line measure is only about five-sixths of the direct highway route or short railroad route.18

TABLE 1.-ILLUSTRATION OF SAMPLING ERRORS FOR COMMODITIES, 1967

Commodity                                                 Thousands tons   Sampling variability
group       Description                                   originated       (percentage)

Least variability
371         Motor vehicles and motor vehicle equipment    31,312           3.2
331         Steel works and rolling mill products         112,167          4.0
262         Paper, except building paper                  18,969           4.8

Most variability
379         Miscellaneous transportation equipment        1,059            27.0
209         Miscellaneous food preparations               39,714           28.2
288         Yarn and thread                               3,849            34.2

Source: Table A, Bureau of the Census, Census of Transportation, 1967, Vol. III, Commodity Transportation Survey (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1970).

16 This applies both to 1967 and to 1972 censuses, but special reports are available for plants with ten to twenty employees: Bureau of the Census, Census of Transportation, 1967, Traffic Patterns of Small Manufacturing Plants (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1970); and for flows in the printing and publishing industry: Bureau of the Census, Census of Transportation, 1967, Printing, Publishing and Allied Industries (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1970). Similar reports will be published for the 1972 census.

17 Manufacturers were allowed to define a local shipment, and the census publication does not provide any clue to its distance, but it appears to be on the order of twenty-five miles (forty km); 1963 Interregional Commodity Trade Flow Estimates (Silver Spring, Maryland: Faucett Associates, 1971), p. 19.

18 D. E. Church, PICADAD: A System for Machine Processing of Geographic and Distance Factors in Transportation and Marketing Data (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1965).

19 Computations based on material in Bureau of the Census, Census of Manufactures, 1967, Vol. 1, Summary and Subject Statistics, Chapter 1: General Summary Tables, and Chapter 2: Size of Establishments (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1968). Data available on computer tape for all states, and in the production area series for an extra twenty-five major centers of population as destination areas, provide a twenty-five by fifty-nine origin-destination matrix. A detailed description of printed and unpublished data plans for the 1973 census of transportation can be obtained from Bureau of the Census, Transportation Statistics Available from the Bureau of the Census, Data Access Descriptions, No. 34 (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1974).

20 Indianapolis and Kansas City were additional areas in the 1972 Census.

The spatial coverage limitation is particularly severe for the production area data. In 1967 twenty-five areas accounted for only three-fifths of the manufacturing in the nation, and for only forty-seven percent of the total shipments of manufactured goods recorded by the census.19 Several major manufacturing areas (Kansas City, Grand Rapids, Indianapolis, and Portland, Oregon) each had more manufacturing employment in 1967 than the Denver production area, which has the least manufacturing of the twenty-five reported.20 The coverage provided by the


TABLE 2.-FREQUENCY TABULATION OF COMPARISON OF PRODUCTION AREA AND STATE FLOWS WITH NATIONAL TOTAL ESTIMATES, 1963

Sum of twenty-five area-to-area flows as a percentage of the national total estimate

Percentage no 1-10 10-19 20-29 30-39 40-49 50-69 national total

Number of commodities 3 6 7 13 7 7 4 3 50

exceeded

data estimate

Sum of state-to-state flows as a percentage of the national total estimate Percentage under40 40-49 50-59 60-69 70-79 80-89 90-99 total Number of commodities 4 2 7 1 1 6 11 9 50

Source: 1963 Interregional Commodity Trade Flow Estimates (Silver Spring, Maryland: Faucett Associates, 1971), Table 2.1, pp. 15-17.

production areas also varies considerably with type of commodity, and is particularly poor for industries like furniture manufacturing, most of whose production is outside the major manufacturing areas.

Further evidence of inadequate coverage was provided by a set of data for the 1963 census comparing estimated shipment totals for states and production areas with estimates of national shipments for fifty commodities.21 At most the twenty-five production areas accounted for sixty-four percent of the national shipments, and only four categories had values of fifty percent or more (Table 2). The unreliability of the data is shown by three categories for which the sum of the estimates for production areas exceeded the national estimate. Even for the state-to-state flows, which encompass the whole country, only eight of the fifty commodities had percentages in excess of ninety, revealing that intrastate flows are important elements in the total commodity flow system.
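The coverage comparison described above amounts to a simple ratio, and a minimal sketch may make the test explicit. The function names and the tonnage figures are hypothetical illustrations, not values drawn from Table 2 or from the Faucett estimates.

```python
def coverage_percent(area_flows, national_total):
    """Sum of reported area-to-area flows as a percentage of the national estimate."""
    if national_total <= 0:
        raise ValueError("national_total must be positive")
    return 100.0 * sum(area_flows) / national_total


def classify(pct):
    """Flag sums that exceed the national estimate, a sign of unreliable data."""
    if pct > 100.0:
        return "exceeds national estimate"
    return "within national estimate"


# Hypothetical tonnages (thousands of tons) for two illustrative commodities.
examples = {
    "commodity A": ([120.0, 80.0, 56.0], 400.0),
    "commodity B": ([300.0, 150.0], 400.0),
}

for name, (flows, total) in examples.items():
    pct = coverage_percent(flows, total)
    print(f"{name}: {pct:.1f}% ({classify(pct)})")
```

A commodity whose area flows sum to more than the national estimate would be flagged in the same way as the three categories noted in the text.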

In general the Census of Transportation flow data do not meet the needs of filling all cells in a spatial interaction matrix. Too much heterogeneity within regions detracts from the usefulness of the data even where reliable and complete coverage does exist at the scale of the nine census divisions. There are problems where locationally detailed information is available for states and production areas, because states are not ideal units of analysis, and because there are significant sampling, definition, and

21 “Sources of Data on Interregional Trade,” in 1963 Interregional Commodity Trade Flow Estimates, op. cit., footnote 17, pp. 14-18.

omission errors for production areas. There is considerable difficulty in determining shipments within areas and in obtaining detailed commodity information at all scales, since three-digit categories are usually too heterogeneous for useful analysis.

Other sources supplement some of the commodity omissions of the census commodity transportation survey.22 Information may be obtained for the movement of commodities by rail and water, but no data are currently available for flows of air cargo, and practically no origin-destination data are available for pipeline or highway transportation of goods.

22 Flow data for nonmanufactured goods are reviewed in 1963 Interregional Trade Flow Estimates, op. cit., footnote 17, pp. 25-27. Data are available for mineral products in Bureau of Mines, Minerals Yearbook, Vols. 1-11, Metals, Minerals, and Fuels (Washington, D. C.: U. S. Department of the Interior, Bureau of Mines, annual); Bureau of Mines, Bituminous Coals and Lignite: Changing Patterns in Distribution and Markets, 1962-1964 (Washington, D. C.: U. S. Department of the Interior, Bureau of Mines, 1965); and for some farm products, Grain Transportation Statistics for the North Central Region, Statistical Bulletin No. 268 (Washington, D. C.: U. S. Department of Agriculture, Economic Research Service, 1960); Department of Agriculture, For-hire Motor Carriers Hauling Exempt Agricultural Commodities, Report 585 (Washington, D. C.: U. S. Department of Agriculture, Marketing Research Division, 1963); Department of Agriculture, The Traffic Patterns of American Raw Cotton Shipments, Report 705 (Washington, D. C.: U. S. Department of Agriculture, Marketing Research Division, 1965); and Department of Agriculture, Interstate Hauling of California-Arizona Fresh Fruits and Vegetables by Rail and Truck, Report 673 (Washington, D. C.: U. S. Department of Agriculture, Marketing Research Division, 1965). These data series are not comprehensive or recurring, and will not be discussed further.


Other Data Sources

A long-established annual series of data for rail transportation of commodities is prepared from a one percent sample of terminated waybills for all but a small number of railroads in the nation.23 The series was transferred from the Interstate Commerce Commission to the Department of Transportation in 1967. Only one volume of the planned series has appeared since 1966, largely because of disputes among the different modes of transportation, and my remarks will refer to the Interstate Commerce Commission’s waybill statistics.24 Tabulations are provided for traffic volume, weight of shipment, and mileage block zones. Origin-destination information is provided for numerous commodities, up to the five-digit level, for the number of carloads, tonnage, revenues, ton-miles, car-miles, average length of haul, and average revenues for the five official Interstate Commerce Commission regions and for states. The data for regions are not very useful geographically, and I will discuss only the state data.

The one percent sample base provides data with generally low sampling errors, although the sampling variability is higher for shipments

23 Waybill Statistics; their History and Uses (Washington, D. C.: Interstate Commerce Commission, Bureau of Transport Economics and Statistics, 1954); Department of Transportation, Carload Waybill Statistics, 1969, One Percent Sample of Terminations in the Year, 1969, Territorial Distribution, Traffic and Revenue by Commodity Classes (TD-1) (Washington, D. C.: U. S. Department of Transportation, Office of Systems Analysis and Information, 1971); Department of Transportation, 1969 Waybill Territorial Distribution, Straight-Line Mile Version (TD-2); Department of Transportation, 1969 Mileage Distributions of Carloads for Selected Classes by Type of Car (TC-1); Department of Transportation, Weight Distribution of 1969 Carloads for Selected Commodity Classes by Type of Car (TC-2); Department of Transportation, Territorial Distribution of 1969 Carloads for Selected Commodity Classes by Type of Car (TC-3); Department of Transportation, Mileage Block Distribution of 1969 Traffic and Revenue by Selected Commodity Classes, Territorial Movement, and Type of Rate (MB-1); Department of Transportation, 1969 Waybill State-to-State Distribution (SS-1 to SS-7).

24 Carload Waybill Statistics, 1966, Territorial Distribution, Traffic and Revenue by Commodity Classes (TD-1) (Washington, D. C.: Interstate Commerce Commission, 1968); Carload Waybill Statistics, 1966, State-to-State Distribution, Traffic and Revenue (SS-1 to SS-7) (Washington, D. C.: Interstate Commerce Commission, 1968).

between places than for traffic originated. Even though the sample design does not specifically cover areal units, the manner of allocating numbered waybills to originating carriers guarantees no bias in the data by origin.25 The use of a constant sampling fraction means that less reliable coverage is given to areas with small amounts of traffic. Those few railroads which are omitted from the survey carry little traffic.26 The data suffer because they are tabulated only for states. In the absence of published data for smaller units it is not possible to aggregate to more meaningful areas. The tables also exclude intrastate and intercity traffic, which represents about one-third of the national rail traffic volume estimated by the Interstate Commerce Commission.27
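The effect of a constant sampling fraction on areas with little traffic can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that each waybill enters the sample independently with probability f (Bernoulli sampling) and that the total is estimated as the sample count divided by f; the actual waybill design selects by waybill number and need not behave exactly this way. Under that assumption the relative standard error of an estimated count N is sqrt((1 - f) / (f N)).

```python
import math


def relative_standard_error(true_count, fraction=0.01):
    """Relative standard error of an expanded count under assumed Bernoulli sampling.

    The estimate is (sample count) / fraction, whose variance is
    true_count * (1 - fraction) / fraction.
    """
    if true_count <= 0:
        raise ValueError("true_count must be positive")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    return math.sqrt((1.0 - fraction) / (fraction * true_count))


# Hypothetical annual carload totals for a small, medium, and large origin area.
for carloads in (1_000, 100_000, 10_000_000):
    rse = relative_standard_error(carloads)
    print(f"{carloads:>10,d} carloads: relative standard error about {100 * rse:.1f}%")
```

Because the error shrinks only with the square root of the traffic volume, a one percent sample that is adequate for a heavily used origin can be quite unreliable for a state-to-state pair with few carloads.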

The preparation and publication of data on the movement of freight by water is the responsibility of the Corps of Engineers and Maritime Administration. The Corps has an annual series of statistics on the movement of freight, but most data relate to vessel characteristics and single-place-specific information for traffic at ports and on waterway segments.28

25 R. T. Smith, “Technical Aspects of Transportation Flow Data,” Journal of the American Statistical Association, Vol. 49 (1954), pp. 227-39. The sampling design is described in R. G. Rhodes and R. E. Briggs, “The I. C. C.’s Rail Carload Waybill Sample,” I. C. C. Practitioners’ Journal, Vol. 35 (1967/1968), pp. 235-51. Sampling variability figures are not provided in the Interstate Commerce Commission’s publications, but are given in the one publication to appear so far from the Department of Transportation.

26 The excluded railroads are not engaged in interstate commerce, or have average operating revenues over a three-year period of $3,000,000 or less.

27 1963 Interregional Commodity Trade Flow Estimates, op. cit., footnote 17, Table 2.2, p. 22, compares estimates of national rail traffic of manufactured goods for 1963, and provides these figures: 430 million tons of traffic according to the Census of Transportation; 478 million tons according to national figures of the Interstate Commerce Commission as obtained from the Annual Report on Transport Statistics in the United States (Washington, D. C.: Interstate Commerce Commission, annual); and 420 and 318 million tons according to the Interstate Commerce Commission waybill statistics for territories and states respectively. The quality of the Interstate Commerce Commission data is also affected by the inclusion of export, import, transit, and rebilled traffic, the exclusion of traffic moving at less than carload, and the unreliability of estimates for more than carload traffic.

28 Details of the data for waterborne commerce are given in W. A. C. Connelly, “Statistics on Waterborne


One segment of the series does provide some origin-destination data for inland water movement.29 Shipment volumes are given for over thirty commodity categories within six major groups: grains and sugar, logs and lumber, bituminous coal and lignite, petroleum and petroleum products, iron and steel, and chemicals. These accounted for three-quarters of all domestic water commerce in 1968. The flow values are given by shipping and receiving areas for specific ports and waterways, but commodity coverage is incomplete. The published material of the Maritime Administration relates primarily to vessels and the labor force, but does include some flow data for lake and coastal traffic for ten large regional units for specific commodities, and on a port-to-port basis for the aggregate freight movement.30

Only a little dyad-specific information on commodity flows by waterborne carriers is available, but data from the Corps of Engineers and Maritime Administration do supplement the Census of Transportation and railroad waybill series. A new statistical series of the Bureau of the Census fills another significant gap in commodity flow data.31 For 1970 tonnage figures are available for the state of origin of exports and state of destination of imports, with cross-tabulations by customs area, foreign area, and

Commerce Compiled by the Corps of Engineers, U. S. Army,” Highway Research Record, Vol. 82 (1965), pp. 32-37.

29 Corps of Engineers, Waterborne Commerce of the United States: Part 1, Atlantic Coast; Part 2, Gulf Coast, Mississippi River System; Part 3, Great Lakes; Part 4, Pacific Coast, Alaska, and Hawaii; and Part 5, National Summaries (Vicksburg, Mississippi: U. S. Department of the Army, Corps of Engineers, annual). Waterborne commerce is classified into several categories. Domestic or inland traffic is defined as traffic moving between ports, rivers, or water channels of the United States mainland, Alaska, Hawaii, Puerto Rico, and the Virgin Islands. This domestic traffic is subdivided into internal, lake, coastal, and local traffic. Internal traffic moves on inland waterways, coastal over an ocean or the Gulf of Mexico, lakewise on the Great Lakes system between United States ports, and local within a single port.

30 The latest published data are Maritime Administration, Domestic Oceanborne and Great Lakes Commerce of the United States, 1963 (Washington, D. C.: U. S. Department of Commerce, Maritime Administration, 1963).

31 Bureau of the Census, Domestic and International Transportation of U. S. Foreign Trade: 1970 (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1972).

mode of transportation. Commodity detail is provided only for the national total of exports and imports.

The glaring omission is highway traffic. Practically no information can be obtained for that one-fourth of the nation’s flow of commodities which moves by truck transportation, nor is the material available for air cargo flows. The Department of Transportation is attempting to fill these two major gaps by obtaining origin-destination data for air freight from a waybill sample, and for highways by a survey of traffic carried by a sample of privately owned trucks in each state.32 The small size of the truck sample will probably restrict the locational detail available, although the cost of obtaining data even for a small sample is tremendous, and there is a problem in achieving cooperation from airlines and

32 Information about the air freight survey was obtained from Mr. Frank Macklin, U. S. Department of Transportation; G. J. Boetje, System Description and Specification: Air Freight Data Processing System (Cambridge, Massachusetts: Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, 1972); and Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Recommendations for Air Freight Data Study (Cambridge, Massachusetts: Massachusetts Institute of Technology, 1971). Information about the truck survey came from Mr. Nathaniel Lieder, U. S. Department of Transportation; and Federal Highway Administration, Truck Commodity Flow Study, Procedural Guide (Washington, D. C.: U. S. Department of Transportation, Federal Highway Administration, 1972).

33 Whitten, op. cit., footnote 3. The Department of Transportation, Transportation Information-a Report to the Committee on Appropriations, U. S. House of Representatives, from the Secretary of Transportation (Washington, D. C.: U. S. Department of Transportation, 1969), proposed a $36,000,000 program for gathering information for intracity and intercity flows of goods and persons. 1963 Interregional Commodity Trade Flow Estimates, op. cit., footnote 17, estimated that the annual Interstate Commerce Commission waybill series cost about three-quarters of a million dollars for preparation. Methods for obtaining data on the transportation of freight by highway are described in N. Lieder, “Mail Survey to Collect Truck Commodity Data,” in Urban Commodity Flow, Special Report No. 120 (Washington, D. C.: Highway Research Board, National Research Council-National Academy of Sciences, 1971), pp. 182-96. K. J. Dueker and R. J. Zuelsdorf, “Motor Carrier Data and Freight Modal Split,” Highway Research Record, Vol. 322 (1970), pp. 1-12, review different methods of collection of such data. A review of data programs of federal agencies and existing data sources for interurban goods movement is F. T. Bolger and H. W.


Despite problems in the data, a single integrated set of origin-destination data for state units has been developed for the year 1963 as part of an interregional input-output analysis of the United States.34 A matrix of estimates for tonnages and value of flows for sixty-one industry categories and forty-four states or groups of adjacent states has been prepared. Other matrices are planned for census divisions and production areas.35 The flow estimates use commodity flow data, value of total outputs by state, demand for goods by states, and allocations of imports. Despite the weaknesses attributable to data limitations and to the assumptions used in the estimation procedures, this is a relatively reliable and comprehensive data base for analysis of some aspects of the economy of the United States, for predicting flow patterns at future dates, and for forecasting the regional redistribution of economic activity that might result from major public investment.36

Except for this one major source, spatial interaction studies of commodity flows are seriously handicapped by data problems. Numerous authors have developed models of optimum or expected flows, but have been unable to test their propositions.37 Much of the argument over the relative merits of linear programming and gravity models for predicting

Bruck, An Overview of Urban Goods Movement Projects and Data Sources (Cambridge, Massachusetts: Urban Systems Laboratory, Massachusetts Institute of Technology, 1972), Part 11, pp. 83-151.

34 K. R. Polenske, A Multiregional Input-Output Model for the United States, Report No. 21 (Washington, D. C.: U. S. Department of Commerce, Economic Development Administration, 1970). This data set has recently become available on computer tape via the National Technical Information Service, U. S. Department of Commerce, Springfield, Virginia.

35 1963 Interregional Commodity Trade Flow Estimates, op. cit., footnote 17.

36 For details of the estimation procedures see 1963 Interregional Commodity Trade Flow Estimates, op. cit., footnote 17; and Polenske, op. cit., footnote 34. For the methodology of interregional trade analysis using these data see Shaw, op. cit., footnote 9; Bureau of Economic Analysis, Toward Development of a National-Regional Impact Evaluation System and the Upper Licking Pilot Study, Staff Paper in Economics and Statistics, No. 18 (Washington, D. C.: U. S. Department of Commerce, Bureau of Economic Analysis, 1971); and Olson, op. cit., footnote 9.

37 Moses, op. cit., footnote 9; R. L. Morrill and W. L. Garrison, “Projections of Interregional Patterns of Trade in Wheat and Flour,” Economic Geography, Vol. 35 (1960), pp. 116-26; Perle, op. cit., footnote 9; and Black (1971), op. cit., footnote 9.

interregional commodity flows is caused by inability to obtain locationally detailed data for homogeneous commodity categories.38 Data limitations make difficult the determination of flow patterns for descriptive purposes even if only ordinal measurement is used.39 There are also doubts about even the matrix marginal totals of traffic generated and terminated, and the within-area flows. It is sad that the best commodity flow maps in almost twenty years are a pair which show the aggregate flows of commodities among the nine census divisions.40

NONMIGRATORY MOVEMENTS OF PEOPLE

Studies of nonmigratory intercity population movements require data on locational, transportation, and traveler characteristics. It is not easy to fill the cells of an origin-destination matrix even for aggregate movement. A variety of information is available on the use of different modes of transportation, but it is oriented towards counts of traffic densities at specific points on highways or at airports. Flow data for the nation are lacking except for a survey of air passenger traffic and some information in the census of transportation.

In an annual survey of air passenger traffic, data are provided on the numbers of passengers on certificated route carriers for pairs of cities.41 It is possible to obtain origin-destination data for most of the domestic air traffic by scheduled services. Reliable and accurate flow volumes are generated from a ten percent sample of passenger tickets on the originating route air carriers.42 No details about the sampling error

38 K. Mera, “An Evaluation of Gravity and Linear Programming Models for Predicting Interregional Commodity Flows,” in Meyer, op. cit., footnote 9, pp. 297-308.

39 Spiegelglas, op. cit., footnote 9.

40 Bureau of the Census, Shipments of Commodities by Manufacturers in the Conterminous United States, Map No. 26, Outflow from Census Geographic Divisions: 1963; and Map No. 27, Inflow to Census Geographic Divisions: 1963, Map Series GE-50 (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, undated).

41 Civil Aeronautics Board, Origin-Destination Survey of Airline Passenger Traffic, Domestic Traffic (Washington, D. C.: Civil Aeronautics Board and the Air Transport Association of America, annual).

42 Information on the technicalities and limitations of the sample may be obtained from Civil Aeronautics Board, Instructions to Air Carriers for Collecting and Reporting Passenger Origin-Destination Survey Statistics (Washington, D. C.: Civil Aeronautics Board, revised edition, 1970) and Civil Aeronautics Board, op. cit., footnote 41.

are published, but it is probably not great even for city pairs with small volumes of traffic. On the other hand, there are difficulties in identifying true origins and destinations, and the stops on a multistop trip. A number of fairly elaborate procedures ensure consistency in the data, and tables provide information on the routing of journeys between different cities.43 The magnitude of errors arising from double-counting and incorrect specification of origin or destination city is not known.44 Another limitation to the data is the lack of information on the characteristics of travelers or trip purposes.

Despite the drawbacks, this data set is perhaps the best spatial interaction matrix available, and it has been used for revealing aspects of the organization of air traffic, for calibrating and testing distance decay, gravity, and intervening opportunities models, and for determining the demand for air transportation between places.45

43 Civil Aeronautics Board, op. cit., footnote 41, Introduction.

44 Extensive and costly adjustments have been made to the origin-destination data to obtain a matrix of flows for true origins and destinations; R. H. Ellis et al., “Consideration of Intermodal Competition in the Forecasting of National Intercity Travel,” Highway Research Record, Vol. 369 (1971), pp. 253-61. One major adjustment to the Civil Aeronautics Board data consists of allocating travel for some 141 zones not included as air hub data points in the original data.

45 Descriptive studies include D. M. Belmont, “A Study of Airline Interstation Traffic,” Journal of Air Law and Commerce, Vol. 25 (1958), pp. 361-68; J. B. Lansing and D. M. Blood, “A Cross-Section Analysis of Non-Business Air Travel,” Journal of the American Statistical Association, Vol. 53 (1958), pp. 928-47; J. B. Lansing et al., “An Analysis of Interurban Air Travel,” Quarterly Journal of Economics, Vol. 75 (1961), pp. 87-95; S. B. Richmond, “Interspatial Relationships Affecting Air Travel,” Land Economics, Vol. 33 (1957), pp. 65-73; E. J. Taaffe, “Air Transportation and the U. S. Urban Distribution,” Geographical Review, Vol. 46 (1956), pp. 219-38; E. J. Taaffe, “Trends in Airline Passenger Traffic,” Annals, Association of American Geographers, Vol. 49 (1959), pp. 393-408; and Taaffe, op. cit., footnote 4. The air traffic data were used to calibrate models in R. E. Alcaly, “Aggregation and Gravity Models: Some Empirical Evidence,” Journal of Regional Science, Vol. 7 (1967), pp. 61-74; C. Hammer and F. C. Ikle, “Intercity Telephone and Airline Traffic Related to Distance and the Propensity to Interact,” Sociometry, Vol. 20 (1957), pp. 306-16; E. Howrey, “On the Choice of Forecasting Models for Air Travel,” Journal of Regional Science, Vol. 9 (1969), pp. 215-24; F. C. Ikle, “Sociological Relationship of Traffic to Population and Distance,” Traffic Quarterly, Vol. 8 (1954), pp. 123-36; W. H. Long, “Air Travel, Spatial Structure, and Gravity Models,” Annals of Regional Science, Vol. 4 (1970), No. 2, pp. 97-107; J. R. Meyer, M. R. Straszheim, and J. F. Kain, “Modeling Intercity Passenger Demand,” pp. 137-64 in Meyer, op. cit., footnote 9; and R. Quandt and W. Baumol, “The Demand for Abstract Transportation Modes: Theory and Measurement,” Journal of Regional Science, Vol. 6 (1966), pp. 13-26. The air traffic data were used to estimate traffic demand in S. L. Brown and W. S. Watkins, “The Demand for Air Travel: a Regression Study of Time-Series and Cross-Sectional Data in the United States Domestic Market,” Highway Research Record, Vol. 213 (1968), pp. 21-34; Ellis et al., op. cit., footnote 44; L. B. Lave, “The Demand for Intercity Passenger Transportation,” Journal of Regional Science, Vol. 12 (1972), pp. 71-84; W. H. Long, “City Characteristics and the Demand for Interurban Air Travel,” Land Economics, Vol. 44 (1968), pp. 197-204; W. H. Long, “Airline Service and the Demand for Intercity Air Travel,” Journal of Transport Economics and Policy, Vol. 3 (1969), pp. 287-99; R. Quandt and W. Baumol, “The Demand for Abstract Transportation Modes: Some Hopes,” Journal of Regional Science, Vol. 9 (1969), pp. 159-62; R. E. Quandt and K. H. Young, “Cross-Sectional Travel Demand Models: Estimates and Tests,” in Mathematica, op. cit., footnote 3, Vol 111, pp. 39-74; and S. B. Richmond, “Forecasting Air Passenger Traffic by Multiple Regression Analysis,” Journal of Air Law and Commerce, Vol. 22 (1955), pp. 434-43.

Other forms of intercity movement of people are inadequately covered. Despite the tremendous volume of traffic, very few data on highway flows between places are available. There is no uniform, regular, or extensive coverage of highway traffic, although sample surveys have been conducted in California and the Northeast Corridor.46 Even though the latter survey was a large project, origin-destination data were obtained for only nineteen major city pairs in Megalopolis. The chief contribution of this survey lies in its material on traveler characteristics.

No national flow data are available to the public on the small passenger movement by rail, although individual railroads have traffic information. Some statistics are available for the

46 Mathematica, op. cit., footnote 3; Federal Railroad Administration, The Needs and Desires of Travelers in the Northeast Corridor (Washington, D. C.: U. S. Department of Transportation, Federal Railroad Administration, Office of High Speed Ground Transportation, 1970); and A. M. Voorhees and Associates, The Northeast Corridor Intercity Travel Survey: Air, Auto, and Bus Modes (Washington, D. C.: Office of Systems Analysis and Information, 1971).


Northeast Corridor as a result of comparisons between the Metroliner and conventional trains on the Washington-New York and New York-Boston routes.47

Plans are underway to expand the Northeast Corridor survey to other parts of the nation, and to tabulate rail passenger traffic data from AMTRAK, but there is little hope of obtaining locationally detailed information for bus or automobile [travel].48 Unfortunately the Census of Transportation is of little help. The National Travel Survey presents considerable information for the nation as a whole on the volume of travel, means of transportation, trip characteristics, traveler characteristics, and purpose of travel, but provides little useful material on

The only spatial interaction information for 1967 is for the four census regions. The sample base is too small to provide reliable information for other locational scales.50

Flow patterns of person-movements for business or personal travel can be studied at only a very coarse scale. It is virtually impossible to piece together information from miscellaneous sources, and only the air data source is reliable and provides good coverage. The information on spatial linkages necessary for effective planning of transportation facilities and making investment decisions is not available.

47 Federal Railroad Administration, Rail Passenger Statistics in the Northeast Corridor, 1971 (Washington, D. C.: U. S. Department of Transportation, Federal Railroad Administration, Office of High Speed Ground Transportation, 1972).

48 Personal communication from Mr. David DeBoer, U. S. Department of Transportation.

49 Bureau of the Census, Census of Transportation, 1967, Volume I, National Travel Survey (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1968); and Census of Transportation, 1972, Volume I, National Travel Survey (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1973). The 1972 survey has more information than the 1967 survey, including separate reports on spring and summer travel as well as a report covering all of 1972.

50 For details of the sample design for 1967 see Census of Transportation, 1967, Volume I, National Travel Survey, op. cit., footnote 49, pp. 85-87; and for 1972, see Census of Transportation, 1972, Volume I, National Travel Survey, op. cit., footnote 49, pp. vii-xi. Through the use of a larger sample the 1972 survey will have data for eight travel regions and selected states. More information on the 1972 National Travel Survey may be had from Transportation Statistics Available from the Bureau of the Census, op. cit., footnote 19.

Any prediction of the volume of intercity traffic must be based on many untestable assumptions. Forecasting of the demands for different modes of transportation for particular areas is difficult and dangerous. There is a noticeable imbalance between the rather sophisticated models now available and the data base necessary for their calibration and

POPULATION MIGRATION

The fundamental data need for analysis of population migration is an origin-destination matrix which includes as much areal detail as possible. Additional material for the characteristics of migrants is desirable. More data are available on mobility status, the migration component of population change, and migration intensities than for the migration streams between places.
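The form of such an origin-destination matrix can be shown with a minimal sketch that cross-tabulates pairs of earlier and current residence, the kind of record a census residence question yields. The area labels, the records, and the function names are hypothetical and serve only to show the structure.

```python
from collections import Counter


def migration_matrix(records, areas):
    """Cross-tabulate (earlier residence, current residence) pairs by area."""
    counts = Counter(records)
    return [[counts[(origin, dest)] for dest in areas] for origin in areas]


def migration_streams(matrix, areas):
    """Return the nonzero off-diagonal cells, i.e. moves between different areas."""
    return {
        (areas[i], areas[j]): matrix[i][j]
        for i in range(len(areas))
        for j in range(len(areas))
        if i != j and matrix[i][j] > 0
    }


# Hypothetical records: (area of residence at the earlier date, area at the census).
areas = ["A", "B", "C"]
records = [("A", "A"), ("A", "B"), ("B", "A"), ("A", "B"), ("C", "C"), ("C", "A")]

matrix = migration_matrix(records, areas)
streams = migration_streams(matrix, areas)
for (origin, dest), movers in sorted(streams.items()):
    print(f"{origin} -> {dest}: {movers}")
```

The diagonal cells hold persons who did not change area, which is why data on mobility status alone cannot be used to recover the streams between places.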

Migration data are commonly derived from census counts, population registers, special surveys, and miscellaneous records of addresses such as those held by public utility [companies].52 Data

51 These models are represented by R. Gronau and M. Alcaly, “The Demand for Abstract Transportation Modes: Some Misgivings,” Journal of Regional Science, Vol. 9 (1969), pp. 153-57; Howrey, op. cit., footnote 45; J. B. Lansing and D. M. Blood, “A Cross-Section Analysis of Non-Business Air Travel,” Journal of the American Statistical Association, Vol. 53 (1958), pp. 928-47; Long, op. cit., footnote 45; Mathematica, op. cit., footnote 3; Meyer, op. cit., footnote 45; S. Monsod, “A Cross-Sectional Model for the Demand for Passenger Service in the Northeast Corridor,” in Mathematica, op. cit., footnote 3; R. E. Quandt, The Demand for Travel: Theory and Measurement (Lexington, Massachusetts: Heath Lexington Books, 1970); and Quandt and Baumol, op. cit., footnote 45.

52 Sources of data on migration are discussed in J. F. Hart, “The Changing Distribution of the American Negro,” Annals, Association of American Geographers, Vol. 50 (1960), pp. 242-66; A. J. Jaffe, Handbook of Statistical Methods for Demographers (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1966); H. S. Shryock et al., The Methods and Materials of Demography (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1971); and H. S. Shryock and E. A. Larmon, “Some Longitudinal Data on Internal Migration,” Demography, Vol. 2 (1965), pp. 579-92. Special surveys and address files generally do not provide detailed coverage or comprehensive data, but three surveys are worthy of note. Some ten years ago the Survey Research Center of the Institute for Social Research of the University of Michigan obtained migration data from a nationwide sample of about 3,000 households, J. B. Lansing and E. Mueller, The Geographic Mobility of Labor (Ann Arbor, Michigan: Institute for Social Research, University of


are obtained by recording the place of residence of the population at the time of the census and at some earlier date. A five-year period was used in the 1940, 1960, and 1970 population censuses, and a one-year period was used for the 1950 census.53 Migration streams are given for divisions, states, and cities of 100,000 or more people in the 1935-40 data series, for states in the 1949-50 data series, and for divisions, states, economic subregions, state economic areas (SEAs), and standard metropolitan statistical areas (SMSAs) in the 1955-60 and 1965-70 data series.54 Sex, race, and


Michigan, 1967), and it is conducting a larger nationwide sample survey which includes some data on residential changes, Institute for Social Research, A Longitudinal Study of Family Economics (Ann Arbor, Michigan: Institute for Social Research, University of Michigan, 1969). The Office of Economic Opportunity Survey of 1967 provided some migration data on a state basis, and these have been used by C. L. Beale, Statement at the Hearings on Population Trends Before the Ad Hoc Subcommittee on Urban Growth of the Committee on Banking and Currency, House of Representatives, 91st Congress, 1st Session, Hearings, No. 72, 1969, Part 1, pp. 473-508; R. F. Wertheimer, The Monetary Rewards of Migration Within the United States (Washington, D. C.: Urban Institute, 1970); and P. A. Morrison, “Chronic Movers and the Future Redistribution of Population: A Longitudinal Analysis,” Demography, Vol. 8 (1971), pp. 171-84. The sample size in all these data series is too small to justify migration stream research.

53 H. S. Shryock, Population Mobility Within the United States (Chicago: University of Chicago, 1964) discussed the selection of an appropriate time period. The five-year period is a compromise between the one-year interval, which guarantees greater accuracy because people are better able to remember, and the ten-year interval, whose main advantage is compatibility with the census decade, but which has larger response errors. Shryock also discussed the atypicality of the 1949-50 residential change data compared with years earlier in the decade.

54 Bureau of the Census, Sixteenth Census of the United States: Population, Internal Migration, 1935-1940, State of Birth of the Native Population; Sixteenth Census of the United States: Population, Internal Migration, 1935-1940, Age of Migrants; Social Characteristics of Migrants; Economic Characteristics of Migrants; Color and Sex of Migrants (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1943); Bureau of the Census, U. S. Census of Population, 1950, State of Birth, Special Report P-E, No. 4A (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1956); Bureau of the Census, U. S. Census of Population, 1950, Population Mobility-States and State Economic Areas, Special Report P-E, No. 4B (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1956); Bureau of the Census, Census of Population,

residence characteristics of migrants have occasionally been provided for states and large cities, as in the 1935-40 data, but such material is not generally available for smaller units. Data on migrant characteristics are provided only for the nine census divisions for 1950 and 1960, but data for certain social characteristics have been made available for state economic areas for the 1965-70 migration streams.

The special volume of the 1970 Census of Population on migration streams lists the volume of Negro migration by state of origin in 1965 and SEA of residence in 1970 as well as the complete 510 by 510 origin-destination matrix for all migrants for SEAs.55 Single-place-specific information is tabulated for age and sex of all in- and out-migrants and for Negro migrants, for SEAs. The 1965-70 migration data are based on a fifteen percent sample, with large sampling errors for low migration volumes. The most valuable information is the matrix cross-tabulation of place of 1965 (or 1955) residence with place of residence in 1970 (or 1960) for 510 state eco-

1960, Subject Reports: State of Birth (PC(2)-2A) (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1963); Bureau of the Census, Census of Population, 1960, Subject Reports: Mobility for States and State Economic Areas (PC(2)-2B) (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1964); Bureau of the Census, Census of Population, 1960, Subject Reports: Mobility for Metropolitan Areas (PC(2)-2C) (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1964); Bureau of the Census, Census of Population, 1960, Subject Reports: Lifetime and Recent Migration (PC(2)-2D) (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1963); Bureau of the Census, Census of Population, 1960, Subject Reports: Migration Between State Economic Areas (PC(2)-2E) (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1967); and Bureau of the Census, Census of Population, 1970, Subject Reports: PC(2)-2A, State of Birth; PC(2)-2B, Mobility for States and the Nation; PC(2)-2C, Mobility for Metropolitan Areas (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1973-4). The potential of the 1970 census of population for migration research is reviewed in L. H. Long, “Migration Research and the 1970 Census,” in A. L. Ferris and E. S. Lee, eds., Research and the 1970 Census (Oak Ridge, Tennessee: Southern Regional Demographic Group, Oak Ridge Associated Universities, 1971), pp. 121-29.

55 Bureau of the Census, Census of Population, 1970, Subject Reports: Migration Between State Economic Areas, PC(2)-2E (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1972).


nomic areas and for large metropolitan areas, which permits examination of the complete systems of flows.56 Various segments of the

56 Adams, Borchert, Lowry, and Wolpert have already undertaken basic empirical work for the 1955-1960 data for metropolitan areas, and Lycan and Schwind have used data for divisions and for state economic areas, respectively; portions of the country have been examined by Beale, Chetwynd and Richter, Rogers, Roseman, Rutman, and Schwind: R. B. Adams, “U. S. Metropolitan Migration: Dimensional Predictability,” Proceedings, Association of American Geographers, Vol. 1 (1969), pp. 1-6; J. R. Borchert, “America’s Changing Metropolitan Regions,” Annals, Association of American Geographers, Vol. 62 (1972), pp. 352-73; I. S. Lowry, Migration and Metropolitan Growth: Two Analytical Models (San Francisco: Chandler Publishing Company, 1966); J. Wolpert, “Distance and Directional Bias in Interurban Migration Flows,” Annals, Association of American Geographers, Vol. 57 (1967), pp. 605-16; R. Lycan, “Matrices of Interregional Migration,” Proceedings, Association of American Geographers, Vol. 1 (1969), pp. 89-95; Schwind, op. cit., footnote 4; Beale, op. cit., footnote 52; E. Chetwynd and C. Richter, “Internal Migration in a Low Income Region: The Atlantic Coastal Plains,” Review of Regional Studies, Vol. 2 (1971), No. 1, pp. 83-88; A. Rogers, “A Regression Analysis of Interregional Migration in California,” Review of Economics and Statistics, Vol. 49 (1967), pp. 262-67; C. C. Roseman, “Channelization of Migration Flows from the Rural South,” Proceedings, Association of American Geographers, Vol. 3 (1971), pp. 140-46; G. L. Rutman, “Migration and Economic Opportunities in West Virginia: A Statistical Analysis,” Rural Sociology, Vol. 35 (1970), pp. 206-17; and Schwind, op. cit., footnote 4. Studies of the data from earlier censuses are small in number, but include Shryock, op. cit., footnote 53; D. J. Bogue et al., Subregional Migration in the United States, 1935-1940, Vol. I, Streams of Migration Between Subregions (Oxford, Ohio: Scripps Foundation for Research in Population Problems, 1957); and W. S. Thompson, Migration Within Ohio 1935-1940: A Study of the Re-distribution of Population (Oxford, Ohio: Scripps Foundation for Research in Population Problems, 1951). A historical perspective to migration may be obtained by comparison of the migration data from different censuses, as in J. D. Tarver, “Differentials and Trends in Actual and Expected Distance of Movement of Interstate Migrants,” Rural Sociology, Vol. 36 (1971), pp. 563-71; J. D. Tarver and R. D. McLeod, “Trends in Distances Moved by Interstate Migrants,” Rural Sociology, Vol. 35 (1970), pp. 523-37; or by use of place-of-residence-by-place-of-birth information; H. T. Eldridge and D. S. Thomas, Demographic Analyses and Interrelations, Part III of E. S. Lee et al., Methodological Considerations and Reference Tables, Vol. 1 of S. S. Kuznets and D. S. Thomas, Population Redistribution and Economic Growth, United States, 1870-1950 (Philadelphia: The American Philosophical Society, 1964); Hart, op. cit., footnote 52; and R. L. Morrill and O. F. Donaldson,

total matrix for 1955-60 have been used to test hypotheses about migration or to illustrate conceptual or technical matters.57 The authors of these studies have made little reference to the quality of the data used.58

“Geographical Perspectives on the History of Black Americans,” Economic Geography, Vol. 48 (1972), pp. 1-23. Place of birth data since 1850 are available in Bureau of the Census, Historical Statistics of the United States: Colonial Times to 1957 (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1960), and in special surveys of recent censuses: Bureau of the Census, Sixteenth Census of the United States: Population, Internal Migration, 1935-1940, State of Birth of the Native Population; Sixteenth Census of the United States: Population, Internal Migration, 1935-1940: Age of Migrants; Social Characteristics of Migrants; Economic Characteristics of Migrants; Color and Sex of Migrants (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1943); Bureau of the Census, U. S. Census of Population, 1950, State of Birth, Special Report P-E, No. 4A (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1956); and Bureau of the Census, Census of Population, 1960, Subject Reports: State of Birth, PC(2)-2A (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1963). The migration data are based solely on the difference between area of birth and current residence; intermediate moves, which may be numerous, are not known, and only ultimate changes may be determined. The data are also subject to inaccuracies because of state boundary changes, difficulties in remembering on the part of older people, and omission of intrastate movements; Shryock et al., op. cit., footnote 52.

57 B. J. L. Berry and P. J. Schwind, “Information and Entropy in Migration Flows,” Geographical Analysis, Vol. 1 (1969), pp. 5-14; L. A. Brown and F. E. Horton, “Functional Distance: An Operational Approach,” Geographical Analysis, Vol. 2 (1970), pp. 70-83; O. R. Galle and K. E. Taeuber, “Metropolitan Migration and Intervening Opportunities,” American Sociological Review, Vol. 31 (1966), pp. 5-13; M. J. Greenwood, “An Analysis of the Determinants of Geographic Labor Mobility in the United States,” Review of Economics and Statistics, Vol. 51 (1969), pp. 189-94; M. J. Greenwood and P. J. Gormerly, “A Comparison of the Determinants of White and Nonwhite Interstate Migration,” Demography, Vol. 8 (1971), pp. 141-55; Tarver, op. cit., footnote 56; J. D. Tarver and W. R. Gurley, “A Stochastic Analysis of Geographic Mobility and Population Projections of the Census Divisions in the United States,” Demography, Vol. 2 (1965), pp. 134-39; Tarver and McLeod, op. cit., footnote 56; J. D. Tarver and P. M. Skees, “Vector Representation of Interstate Migration Streams,” Rural Sociology, Vol. 32 (1967), pp. 178-93; J. D. Tarver et al., “Vector Representation of Migration Streams Among Selected State Economic Areas During 1955 to 1960,” Demography, Vol. 4 (1967), pp. 1-18; and Wolpert, op. cit., footnote 4.

58 One major exception is K. E. Taeuber and A. F.


The volume of migration is underestimated by data based on change of place of residence during any given period, because such data fail to record moves within the period, moves of those who died during the period, and moves of those who returned to their original place of residence. Inaccuracies of this sort probably are small if the period is short, although no data reveal the magnitude of this error, nor are the spatial variations in the error known. Response and other non-sampling errors are regarded as small.59
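The undercounting described above can be made concrete with a small sketch. It assumes year-by-year residence histories, which a census does not observe; the histories and the function names are invented for illustration. The sketch compares the moves that actually occurred with the moves a fixed-interval residence question would register.

```python
# Illustrative sketch only: the residence histories below are invented,
# and a census does not observe them. Each history lists a person's area
# of residence at the start of the period and at the end of each year.

def actual_moves(history):
    """Count every change of area between consecutive dates."""
    return sum(1 for a, b in zip(history, history[1:]) if a != b)

def recorded_move(history):
    """Fixed-interval measure: 1 if first and last areas differ, else 0."""
    return 1 if history[0] != history[-1] else 0

histories = {
    "stayer":         ["A", "A", "A", "A", "A", "A"],
    "single mover":   ["A", "A", "B", "B", "B", "B"],
    "repeat mover":   ["A", "B", "B", "C", "C", "C"],
    "return migrant": ["A", "B", "B", "B", "A", "A"],
}

total_actual = sum(actual_moves(h) for h in histories.values())
total_recorded = sum(recorded_move(h) for h in histories.values())

for name, h in histories.items():
    print(f"{name:15s} actual={actual_moves(h)} recorded={recorded_move(h)}")
print(f"total actual={total_actual} total recorded={total_recorded}")
```

The repeat mover contributes one recorded move for two actual moves, and the return migrant contributes none. Movers who died during the period would not appear in the histories at all, so they are missed by both counts in a census-type enumeration.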

On the other hand, there are limitations because of sampling errors. Although a twenty-five percent sample guarantees a moderately low sampling variability, sampling error becomes fifty percent or more for flow volumes of fifty people or less, and twenty percent for 250. Areas with small flow volumes are quite numerous in the SEA flow matrix, and some 7,500 cells out of the possible 14,500 have migration volumes with at least twenty percent error, for one standard error of confidence.60
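A rough sense of how sampling error grows as flows shrink can be had from a simple random sampling approximation. This is an assumption made for illustration only: the census sample was not a simple random sample of persons, and the hypothetical function `relative_standard_error` below, with its optional `design_factor`, does not reproduce the Bureau's published table of variability, whose figures are larger. The sketch shows only the general shape, in which relative error falls roughly with the square root of the flow volume.

```python
import math

def relative_standard_error(flow, sampling_fraction, design_factor=1.0):
    """Approximate relative standard error of an estimated flow volume.

    Assumes each migrant enters the sample independently with probability
    `sampling_fraction` (binomial approximation). `design_factor` is a
    hypothetical multiplier on the variance for clustering effects.
    """
    if flow <= 0 or not 0 < sampling_fraction <= 1:
        raise ValueError("flow must be positive and fraction in (0, 1]")
    variance_ratio = design_factor * (1 - sampling_fraction) / (sampling_fraction * flow)
    return math.sqrt(variance_ratio)

for flow in (50, 250, 1000, 5000):
    rse = relative_standard_error(flow, 0.25)
    print(f"flow {flow:5d}: relative standard error about {rse:.1%}")
```

Under this approximation a fivefold increase in flow volume reduces the relative error by a factor of about 2.2, which is broadly the pattern in the figures quoted above even though the levels differ.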

The major drawbacks in these data are not undercounting and sampling errors, but the lack of information on migrant characteristics for pairs of small areal units, although data cross-classified by race, sex, and type of residence can be obtained for nine census divisions, and for white/nonwhite differences for states and some metropolitan areas. Institutional populations are included in the aggregate migration flow volume, and it is not possible to determine readily the magnitude of this kind of contamination for particular flows.61 Other

Taeuber, “The Changing Character of Negro Migration,” American Journal of Sociology, Vol. 70 (1965), pp. 429-41, who state: “Precise determination of the differing character and relative importance of each migration stream requires a large body of systematic data. Unfortunately much of the literature has been based on data that are not only unsystematic but empirically untrustworthy.”

59 Bureau of the Census, Census of Population, 1960, Subject Reports: Migration Between State Economic Areas, PC(2)-2E (Washington, D. C.: U. S. Department of Commerce, Bureau of the Census, 1967), p. xvi.

60 My estimate; Bureau of the Census, op. cit., footnote 59, pp. xiv-xvi has a brief account of sampling errors, and Table A has estimates of the variability for flows of different magnitude.

61 Taeuber and Taeuber, op. cit., footnote 58, say that one-third of the nonwhite migrants to Atlanta from other metropolitan areas moved to group quarters, in particular a large federal prison and Atlanta University.

problems include uncertainty about the representativeness of the chosen time interval for measuring migration, the impact of the boundaries of areal units on that measurement, and the correlation between the recorded movement and the scale of areal units.

Sample data from the Social Security Administration’s records of employees are increasingly used as an alternative source of information on the geographic mobility of people.62 In this series data are available on a yearly basis for a one percent sample of all workers covered by social security, now about ninety million people. Most of the statistics relate to employee characteristics and earnings, but place and industry of employment are recorded. It is possible to determine, with some error, the migration of the same group of people over a time period, and associated features of employment, earnings, age, sex, and racial characteristics. Spatial interaction matrices for migration at different scales can be derived by aggregating the basic information for counties. Numerous migration studies have used these data.63 This material has been used to investi-

62 Social Security Administration, Basic Statistical Data Files Available to Outside Researchers (Washington, D. C.: Department of Health, Education and Welfare, Social Security Administration, Office of Research and Statistics, 1971). This data source is reviewed in D. Hirschberg, “The Social Security Administration’s One-Percent Sample,” in Ferris and Lee, op. cit., footnote 54, pp. 167-74.

63 L. L. Bauer and C. B. Sappington, “Some Uses at the State Level of the Social Security Work History Sample Data,” in The Labor Force: Migration, Earnings, and Growth, Bulletin Y-63 (Muscle Shoals, Alabama: Tennessee Valley Authority, 1973); D. J. Bogue, A Methodological Study of Migration and Labor Mobility (Oxford, Ohio: Scripps Foundation for Research in Population Problems, 1952); L. E. Gallaway, “Industry Variation in Geographic Labor Mobility Patterns,” Journal of Human Resources, Vol. 2 (1967), pp. 461-74; L. E. Gallaway, Interindustry Labor Mobility in the United States, 1957 to 1960, Research Report No. 18 (Washington, D. C.: U. S. Department of Health, Education, and Welfare, Social Security Administration, 1967); L. E. Gallaway, “Geographic Flows of Hired Agricultural Labor, 1957 to 1960,” American Journal of Agricultural Economics, Vol. 50 (1968), pp. 199-212; L. E. Gallaway, Geographic Labor Mobility in the United States, 1957-1960, Research Report No. 28 (Washington, D. C.: U. S. Department of Health, Education, and Welfare, Social Security Administration, 1969); L. E. Gallaway, “Age and Labor Mobility Patterns,” Southern Economic Journal, Vol. 36 (1969/1970), pp. 171-80; L. E. Gallaway et al., “The Economics of Labor Mobility: An Empirical Analysis,” Western


gate industrial mobility, migration in the Tennessee Valley Authority area, migration in the context of regional investment decisions, and urban area growth.64

These data are close to being a continuous population register for the United States, but they have several limitations for the study of migration streams.65 First, the data relate to change in employment rather than residence, and relocation of the employer will be recorded as a migration of employees. The differentiation between place-of-work and place-of-residence of employee may, however, be minimized by using areal units based on commuting regions. Secondly, about ten percent of all employed people are omitted from the sample because federal and some other government workers, railroad employees, and some self-employed persons are not covered by the national social security system.66 Thirdly, the data

Economic Journal, Vol. 5 (1966/1967), pp. 211-23; P. A. Morrison, “Chronic Movers and the Future Redistribution of Population: A Longitudinal Analysis,” Demography, Vol. 8 (1971), pp. 171-84; C. B. Sappington and L. L. Bauer, Income and Mobility of Tennessee Farm and Non-Farm Workers, 1960-1965, Bulletin 471 (Knoxville, Tennessee: University of Tennessee, Agricultural Experiment Station, 1970); W. G. Smith et al., Metropolitan Labor Force Migration in the Southeast 1960-1965, Bulletin Y-39 (Muscle Shoals, Alabama: Tennessee Valley Authority Mobility Studies, 1971); C. E. Trott, “Differential Responses in the Decision to Migrate,” Papers of the Regional Science Association, Vol. 28 (1972), pp. 203-19; and P. H. Vernon, “Distance Selectivity of the United States Labor Force Migration, 1960-1963,” Proceedings, Association of American Geographers, Vol. 1 (1969), pp. 153-56.

64 P. A. Morrison, Uses of the Social Security Work History Sample in Studying Metropolitan Migration, Paper P-4869 (Santa Monica, California: The Rand Corporation, 1972); W. G. Smith and R. A. Matson, Mobility of the Tennessee Valley Labor Force 1957-1963, Bulletin Y-23 (Muscle Shoals, Alabama: Tennessee Valley Authority Mobility Studies, 1971); W. G. Smith et al., in The Labor Force: Migration, Earnings, and Growth, op. cit., footnote 63, pp. 36-61; and G. J. Stolnitz, “U. S. Interindustry Mobility 1960-1968,” in The Labor Force: Migration, Earnings, and Growth, op. cit., footnote 63, pp. 108-09.

65 The data are reviewed by S. A. Rubin, “OASDI Data: Its Scope and Limitations in Economic Analysis,” in The Labor Force: Migration, Earnings, and Growth, op. cit., footnote 63, pp. 71-76; and Stolnitz, op. cit., footnote 64.

66 Rubin, op. cit., footnote 65, states that 9.0 percent of all employees in 1968 were wage and salary workers not reported to the Social Security Administration, and 1.7 percent were self-employed persons not reported.

are based on a one percent sample, and sampling errors are large for counties with small numbers of employees. The sample guarantees a one percent coverage for each county, but the error factor varies with the size of the county, and it is necessary to amalgamate the data to larger units or concentrate on large areas.67
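The amalgamation of county data into larger units can be sketched as follows. The county codes, the county-to-region lookup, and the flow counts are all invented for illustration and do not come from the Social Security files. The point is only that flows between counties in the same region drop out as internal moves, while the remaining cells pool more sample cases.

```python
from collections import defaultdict

def aggregate_flows(county_flows, county_to_region):
    """Sum county-to-county flows into region-to-region flows.

    Flows whose origin and destination fall in the same region are
    treated as internal moves and returned separately.
    """
    between = defaultdict(int)
    internal = defaultdict(int)
    for (origin, destination), count in county_flows.items():
        r_from = county_to_region[origin]
        r_to = county_to_region[destination]
        if r_from == r_to:
            internal[r_from] += count
        else:
            between[(r_from, r_to)] += count
    return dict(between), dict(internal)

# Hypothetical sample counts of workers changing county of employment.
county_flows = {
    ("c1", "c2"): 3,
    ("c1", "c3"): 4,
    ("c2", "c3"): 2,
    ("c2", "c4"): 5,
    ("c3", "c1"): 1,
    ("c4", "c2"): 6,
}
county_to_region = {"c1": "East", "c2": "East", "c3": "West", "c4": "West"}

between, internal = aggregate_flows(county_flows, county_to_region)
print("between regions:", between)
print("internal moves:", internal)
```

Larger cells carry a smaller relative sampling error, but at the cost of the areal detail that made the county data attractive in the first place.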

Limitations are also caused by the omission of the occupation of the workers, the high cost of the data, the expense of extracting information from the basic data files, and changes in definitions over time.68 The benefits of the data, which offset their weaknesses, are the potential for undertaking cross-sectional and longitudinal studies, the feasibility of working at a fine areal scale, and the ability to aggregate data to meaningful areal units.

In overview, there is a considerable amount of quite reliable flow information for studies of population redistribution. Provided several assumptions are made, or screening is careful, it is possible to undertake locationally detailed empirical studies of migration streams, and analysis of the implications of population redistribution for policy on population distribution.69

67 Smith et al., op. cit., footnote 63.

68 G. Eubanks in The Labor Force: Migration, Earnings, and Growth, op. cit., footnote 63, pp. 77-81.

69 Unlike interregional commodity flows, the methodology and structure of models for predicting migration streams are not well advanced; much of the recent effort in this area has been directed more towards particular tools: L. A. Brown, “On the Use of Markov Chains in Movement Research,” Economic Geography, Vol. 46 (1970), pp. 393-403; Brown and Holmes, op. cit., footnote 4; Brown, Odland, and Golledge, op. cit., footnote 4; R. C. Y. Ng, “Recent Internal Population Movement in Thailand,” Annals, Association of American Geographers, Vol. 59 (1969), pp. 710-30; A. Rogers, “Matrix Analysis of Interregional Population Growth and Distribution,” Papers of the Regional Science Association, Vol. 18 (1967), pp. 177-96; A. Rogers, Matrix Analysis of Interregional Population Growth and Distribution (Berkeley, California: University of California Press, 1968); Tarver and Gurley, op. cit., footnote 57; Tarver and Skees, op. cit., footnote 57; R. Thomlinson, “A Model for Migration Analysis,” Journal of the American Statistical Association, Vol. 56 (1961), pp. 675-86; and L. Yapa and J. Wolpert, “Time Paths of Migration Flows: Belgium, 1954-1962,” Geographical Analysis, Vol. 3 (1971), pp. 157-64. To some extent conceptual developments are ahead of the available data: S. M. Golant, “Adjustment Process in a System: A Behavioral Model of Human Movement,” Geographical Analysis, Vol. 3 (1971), pp. 203-20; A. L. Mabogunje,


CONCLUDING COMMENT

Model development is proceeding more quickly than data development in many areas of spatial interaction analysis. Although models are themselves only generalizations with an implicit error element, the development of models of interregional economic dependence, conceptual frameworks of field theory applied to different phenomena, and models of spatial efficiency in population distribution have moved ahead of the requisite data base. Even the less demanding notions of dominance, functional

“Systems Approach to a Theory of Urban-Rural Migration,” Geographical Analysis, Vol. 2 (1970), pp. 1-18; G. Olsson, “Central Place System, Spatial Interaction, and Stochastic Processes,” Papers of the Regional Science Association, Vol. 18 (1967), pp. 13-45; C. C. Roseman, “Migration, the Journey to Work, and Household Characteristics: An Analysis Based on Non-areal Aggregation,” Economic Geography, Vol. 47 (1971), pp. 467-74; J. Wolpert, “Decision Making in a Spatial Context,” Annals, Association of American Geographers, Vol. 54 (1964), pp. 537-58; L. Yapa et al., “Interdependence of Commuting and Migration,” Proceedings, Association of American Geographers, Vol. 1 (1969), pp. 163-68; L. Yapa et al., “Interdependences of Commuting, Migration, and Job Site Relocation,” Economic Geography, Vol. 47 (1971), pp. 59-72; and W. Zelinsky, “The Hypothesis of the Mobility Transition,” Geographical Review, Vol. 61 (1971), pp. 219-49.

regions, and distance decay in spatial interaction cannot be vigorously tested for shortage of good data.

This lack is understandable because of the costs of obtaining comprehensive and detailed data for a country as large and complex as the United States, but efficient planning and policy-making, let alone academic research, ought surely to be based on sound data. Those statistical series which do exist need improvements. Providers of spatial interaction data are oriented towards reducing costs and sampling errors rather than improving coverage. Users seem to pay scant attention to sampling and other errors, although they are aware of omission problems. Data suppliers are not primarily concerned with spatial aspects, yet users for spatial analysis have not placed enough emphasis on the effects of this bias. The many compromises between detail of the phenomenon and areal detail, and the compromises among the relative viewpoints of supplier and user do, on balance, operate against the development of a sound empirical foundation for spatial interaction analysis.

DEREK THOMPSON

Dr. Thompson is Associate Professor of Geography at the University of Maryland in College Park, MD 20742.