On the diversity of diversity; is there still a poace for small area classification?

Area (2001) 33.3, 329–336

Comments

The diversity of diversity: is there still aplace for small area classifications?Rich Harris, School of Geography, Birkbeck College,7-15 Gresse St., LONDON. W1P 2LLEmail: [email protected] received 12 April 2001

Introduction

In a recent issue of Area (2001, 33. 1), Voas andWilliamson (2001) present a critique of geodemo-graphics. They argue that ‘small areas are different inmany different ways’ (p.74) and that most of thisdiversity remains, even after geodemographic classi-fication (p.63). In the context of geodemographics, asmall area is defined as a census enumeration district(ED), containing on average about 150 households,though varying in size from about 20 to 200 houses(Openshaw 1995). For the 1991 Census over100 000 enumeration districts (EDs) were defined inEngland and Wales. In broad terms, a geodemo-graphic classification is a quantitative grouping ofthese EDs into a much reduced number of classes or‘neighbourhood types’ (typically between 10 and150). Ideally, this grouping would be conducted on alike-with-like basis. However, Voas and Williamsondemonstrate that after geodemographic classifi-cation the differences between the various neigh-bourhood classes are frequently less than thedifferences within the neighbourhood classes. Inother words, although two EDs may be assigned tothe same geodemographic group, the actual simi-larities between these two small areas may, in fact,be slight.

The diversity of diversity

It is apparent from Voas and Williamson’s researchthat EDs assigned to any one geodemographic classare, as a group, characterized more by diversity thanby uniformity. Worse still, it is possible to ‘dig deeper’and show that it is not only these coarse, geodemo-graphic classes that exhibit heterogeneity, but alsothe EDs themselves. For example, Longley and Harris(1999) undertook a geodemographic classification

of households — not EDs — using commercial,marketing data for a study region of Bristol, England.In their analysis, 51 882 households were sorted onthe basis of 241 socio-economic and behaviouralvariables into just 16 different groups, using a like-with-like algorithm. The 16 groups were then used toexamine the heterogeneity of EDs in the Bristolregion. The results showed ‘a quite staggering diver-sity of lifestyles within EDs’ (p.867) with 12 or moreof the 16 groups present in the majority of smallareas. Such findings raise three important questions:Are present geodemographic classification tech-niques ineffective? If so, can they be refined orimproved? Or, is the task of defining and compari-ng small areas — of small area classification —ultimately futile?

Are geodemographic classification techniquesineffective?The answer to this question really depends on theusers’ needs. Although Voas and Williamson suggestthe ‘near-impossibility of defining a good general-purpose classification’ (p.73), surely the rapid growthof the geodemographic industry (Sleight 1997) andits involvement in a wide range of applications istestimony to the fact that as far as many are con-cerned, good, general-purpose classifications havebeen achieved? It is easy to under-value the appealof geodemographics as a simple, analytical toolwhich is easy to apply, easy to understand, andeasy to explain to the Managing Director! Aneditorial comment written in response to Harris’(1999a) review of geodemographic classifications isindicative:

the elegant simplicity of the insights that can still begained . . . by a simple [geodemographic] analysis . . .continues to impress this practitioner. (Whitehead1999, 109).

In response to Openshaw’s presentation of an ‘Intel-ligent Geodemographic Thinking Machine’, Reed(1999, 32) writes, ‘there is no incentive for such amassive improvement . . . as long as [an] incrementalgrowth [of, say, five percent] can be achieved year-on-year, existing approaches will be used and modi-fied’. However, whilst a five per cent efficiency —that is, a 95 per cent inefficiency — might be

ISSN 0004-0894 � Royal Geographical Society (with The Institute of British Geographers) 2001

sufficient to gain a competitive advantage over a rivalretailer, it is clearly not satisfactory when census-based classification techniques are implicated in therationalization (closure) of bank branches, or in thedistribution of government spending to apparentlyimpoverished or disadvantaged areas.

Can the classification techniques be improved?Based on the view that ‘clearly it can be useful toclassify residential neighbourhoods’ (p.74), the issuebecomes ‘how best to do it’. Voas and Williamson’spreferred approach is for task-specific classifications:‘in view of the relative ease with which purpose-builtindices can now be generated, it may be reasonableto classify areas according to the demands of theapplications’. The basis for this view is the truism thata tailor-made ‘outfit’ ought always to fit better thanan off-the-shelf design (albeit that the off-the-shelfpackage could probably be worn by a greaternumber of users). Yet, it is also true to say that thesuitability of the fit will be limited by the choice ofmaterial to work with, no matter how it is weaved.Census-based classifications will always be limitedby:

+ the ageing of the data;+ a limited and narrow range of socio-economic

and demographic variables; and+ the scales of aggregation at which the data are

disseminated.

New material, specifically the increased availability ofsub-ED microdata, offer opportunity for householdand individual-level classifications of a geodemo-graphic style. Since the classification is no longerundertaken at the area level, so different individualsor households within any one ED can be assigned todifferent classes. This means it is no longer necessaryto assume EDs have uniform populations; indeed thefine-scale classifications can be used to investigatethe heterogeneity of EDs, as Longley and Harris(1999) have shown. However, Longley and Harris donot really consider the uniformity, or otherwise, ofthe populations within each household grouping.The problem of diversity within each geodemo-graphic class no doubt remains.

In a commercial setting, marketing firms havemodelled the attributes of individuals and/orhouseholds based on information contained in datawarehouses. These ’warehouses’ are supplied byinformation from postal questionnaires, loyalty carddata, customer records of retail companies, productwarranties and guarantee registrations, or the UK

Electoral Register, amongst other sources. Often,these sorts of non-governmental data are collectivelyreferred to as lifestyle data and are used to createdigital profiles of individuals or their households. Alike-with-like grouping of these digital personae intoa small number of consumer types produces ageolifestyle classification (at the individual or house-hold scale). In the era of one-to-one and directmarketing, geolifestyle classifications are increasinglysupplanting their geodemographic forebears in manyareas of retail analysis.

Is the task of defining and comparing small areasultimately futile?A premise of geodemographic classification is that‘birds of a feather flock together’ (Flowerdew andLeventhal 1998). That adage expresses similar senti-ment to Tobler’s ‘first law of geography’: everythingis related to everything else, but near things are morerelated than far things (Tobler 1970). Both essentiallyappeal to the existence of some sort of ’neighbour-hood effect’. However, if small areas are markedmore by a diversity of population than by a uniform-ity, then in what sense is the evidence for suchneighbourhood effects anything other than anec-dotal? Certainly, there is need to recognize that‘cities are increasingly intersections of multiple websof economic and social life, many of which do notinterconnect’ (Know and Pinch 2000, 71, after Aminand Graham 1997). However, the fact that ‘wecannot generalize about the city’ (Knox and Pinch2000, 71 original emphasis) does not mean that localpatterns of a given attribute, behaviour or phenom-ena cannot — or should not — be detected, com-pared or explored. For example, Figures 1 and 3show the results of a simple GIS-based analysis ofcommercial, lifestyles data for the Brent district ofLondon. Both are based on information compiledfrom a nation-wide, consumer survey undertaken inJanuary to March 1999, to which one million house-holds responded. Figure 1 shows the number ofhouseholds, within 500 metres of each postcodecontaining at least one survey respondent, who havean annual household income of less than £10 000.Figure 2 shows the number of households that arewithin 500 metres and that have access to satellitetelevision channels. Figure 3 shows householdswithin 500 metres who do not own a car. There aresome clear geographical patterns in the data, thoughthese obviously differ slightly, depending on theattribute that is modelled and the way it is classified.Note that the geographies shown in Figures 1 to 3

330 Comments

Figure 1 GIS analysis of lifestyles data, showing local density of low income households across the Brent region

Com

ments

331

Figure 2 GIS analysis of lifestyles data, showing local density of households having satellite TV across the Brent region

332C

omm

ents

Figure 3 GIS analysis of lifestyles data, showing local density of households without a car across the Brent region

Com

ments

333

shows some congruity with a Ward-level neighbour-hood statistic downloaded from the Office ofNational Statistics (ONS) website: www.statistics.gov.uk. The neighbourhood statistic shown is theDepartment of the Environment, Transport and theRegions’ Index of Multiple Deprivation (DETR 2000).

Conclusions: is there still a place for the‘first law of geography’?

In a multilevel regression analysis of householdincome data for Bristol, Harris (1999b) partitionedthe diversity in the data (the variance) at threehierarchical levels:

1 between ten geodemographic classes, at level 3;2 between EDs within the classes, at level 2; and3 between households within the EDs, at level 1.

A proportion of 15 per cent of the total diversitywas found to be at the geodemographic level; anappreciable and statistically significant amount.However, most of the diversity (74%) was at thehousehold level (the remaining 11% were at the EDlevel). These and other findings (for example,Longley and Harris 1999) lend support to Voas andWilliamson’s contention that ‘the analysis of spacemay point to the individuality of place’ (p.74).Indeed, they may actually suggest that the analysisof space points to the individuality of individuals.Nevertheless, individuals do not live in a social oreconomic vacuum from each other. Significant,neighbourhood effects appear to persist, andTobler’s first law of geography seems as valid aswhen it was written: yes, things may be chaotic at amicro-level where everything appears related toeverything else; yet, out of this chaos often emergessome sense of geographic order, albeit partial andshort-lived.

Conventional geodemographics classify (not so)small areas, at a coarse level of resolution, using datathat are only updated once every ten years. It isinevitable that they fail to capture many aspects ofsmall area diversity. Perhaps the fact that they workat all points to the deeply entrenched nature of somegeographical patterns and effects? On the otherhand, and despite the prefix, there is little geographyin a geolifestyle classification. Factor, principal com-ponents and various sorts of cluster analysis allproject the data into a conceptual space that isdefined by reference to the classificatory variables. If,on a binary scale, one household owns a car (1),whilst another does not (0), then their ‘distance’

apart on the dimension defined by car ownership, isone. This value does not have any geographicalmeaning since the two households could just as wellbe a million miles apart as be neighbours. Theirgeographical distance is not considered to alter thefact that one owns a car whilst the other does not. Atbest, geographical distance will be included as ‘justanother variable’ in the data, having no more weightin the analysis than the car ownership dimension, orany other variable selection. It is only after theclassification, when the results come to be mapped,that geography assumes any real importance.

Geolifestyle analysis, then, classifies individuals ortheir households directly, but using techniques thatare inherently aspatial. What is needed are ‘bottom-up’ approaches to small area classification that areflexible, relevant to the task at hand, and that handlethe new sources of up-to-date, detailed and dis-aggregate data in geographically sensitive ways (andin ways which are also sensitive to the privacy andconfidentiality concerns raised by individual/household-level analyses). Fortunately, new tech-nologies and new paradigms of thinking within thegeographical information sciences are workingtowards these end-goals (see, for example, Atkinsonand Martin 2000; Fotheringham et al. 2000; Halls2001; Openshaw and Abrahart 2000).

Acknowledgements

The author is grateful to Claritas UK for supplying thelifestyles data shown in Figures 1, 2 and 3. The data are �Claritas (UK) and used with permission. Any errors resultingfrom the analysis are my own.

References

Amin A and Graham S 1997 The ordinary city Transactionsof the Institute of British Geographers 22 411–29

Atkinson PM and D Martin eds 2000 GIS and geocompu-tation Taylor and Francis, London

DETR 2000 Index of multiple deprivation http://www.regeneration.detr.gov.uk/

Flowerdew R and Leventhal B 1998 Under the microscopeNew Perspectives 18 36–8

Fotheringham S, Brunsdon C and Charlton M 2000Quantitative geography: modern spatial analysis andgeographic information science Sage, London

Halls P ed 2001 Spatial information and the environmentTaylor and Francis, London

Harris R 1999a Geodemographics and geolifestyles: acomparative review Journal of Targeting, Measurementand Analysis for Marketing 8 164–78

— 1999b Geodemographics and the analysis of urbanlifestyles, unpublished PhD thesis, School of Geography,University of Bristol

334 Comments

Knox P and Pinch S 2000 Urban social geography: anintroduction fourth edition Prentice Hall, Harlow

Longley P A and Harris R 1999 Towards a new digital datainfrastructure for urban analysis and modelling Environ-ment and Planning B: Planning and Design 26 855–78

Openshaw S ed 1995 Census users’ handbook GeoInfor-mation international, Cambridge

Reed D 1999 To be the best. Database Marketing 7 31–3Openshaw S and Abrahart R J 2000 Geocomputation

Taylor and Francis, LondonSleight P 1997 Targetting customers: how to use geodemo-

graphic and lifestyle data in your business second editionNTC Publications, Henley-on-Thames

Tobler W 1970 A computer movie Economic Geography46 234–40

Voas D and Williamson P 2001 The diversity of diversity: acritique of geodemographic classification Area 33 63-76

Whitehead J 1999 Editorial Journal of Targeting, Measure-ment and Analysis for Marketing 8 108–10

Response (The diversity of diversity)David Voas and Paul Williamson, Department ofSociological Studies, University of Sheffield, SheffieldS10 2TU and Department of Geography, Universityof Liverpool, Liverpool L69 7ZTEmails: [email protected] and [email protected]

Harris asks if there is still a place for small areaclassifications, answering (as we do) with a cautious‘yes’. Our paper was intended to justify thatcaution, warning against the sufficiency of any singletaxonomy. This response to Harris’ comments pro-vides an opportunity to clarify our thoughts on boththe advantages and shortcomings of adopting‘problem-specific’ or ‘general-purpose’ area classifi-cations. We also suggest some additional ways inwhich new classifications could take advantage ofthe emerging microdata described by Harris.

A primary reason for preferring problem-specificarea classifications is that they are more accurate:areas that are genuinely similar in two or threerelevant ways can be identified. Within a moregeneral-purpose classification, similarity along manydimensions is unlikely. By conflating a range ofmarginally correlated measures, such as income andnewspaper readership, there is an inherent tendencyto obscure the actual between-area differences onmatters of specific interest.

But in reiterating this point we may be pushingagainst an open door. Increasingly users are turningto other, tailor-made, classifications. In the publicsector deprivation indices are in resurgence, driven

by the release of the freely available, annuallyupdated DETR Index of Multiple Deprivation (DETR2000). This new indicator notably contains not only asummary index, but also indices tailored to measurespecific dimensions of deprivation. In the commer-cial sector, increasing use is being made of thedirectly measured small area income estimates nowavailable (for a cost) from a range of independentproviders. One of the key reasons for the historicsuccess of the main geodemographic systems is that,as our paper demonstrated, these so-called ‘generalpurpose’ classifications are in reality dressed-upindicators of affluence.

While the time may have come for replacinggeneral with purpose-built classifications, even thelatter approach can still leave considerable within-category diversity. The potential gains to be made byreducing dimensionality should not be over-stated.For example, using data from the 1999 CensusRehearsal we have recently shown that even in thepoorest 20 per cent of EDs a significant minority ofhouseholds are headed by earners in the top incomedecile. Conversely, households in the bottomincome decile may be found located in the mostaffluent EDs (Williamson and Voas 2001). Similarexamples, spanning a range of socio-demographicattributes, are provided in Harris’ comments, Voasand Williamson (2000), and Tranmer and Steel(2001).

A word of warning is appropriate here. Taken toan extreme, each household or person could beuniquely categorized, but in so doing one wouldrisk committing the atomistic fallacy (the reverse ofthe ecological fallacy). Neighbours often do sharelimited, if important, attributes in common. Forexample, residents in many pockets of Liverpoolshare the common experiences of low income,non-participation in local elections, and non-compliance with the requirement to complete aCensus form. Whether these similarities are moreimportant than differences along lines such as age,gender and ethnicity depends entirely upon thepurpose for which classification is being made.

For this reason, perhaps we should not be tooready to accept that ‘[all] small areas are markedmore by a diversity of population than a uniformity’(p.333). There is clear evidence that some areas arerelatively homogeneous, at least when measuredalong selected important dimensions. An alternativeviewpoint is not to deny that areas of homogeneityexist, but to recognize that they are surprisinglyrare, and far out-numbered by areas that exhibit

Comments 335

Documents

On the diversity of diversity; is there still a poace for small area classification?