‘Improving the ethnic classification of patient registers’ Centre for Advanced Spatial Analysis,...

‘Improving the ethnic classification of patient registers’

Centre for Advanced Spatial Analysis, UCL

25th May, 2005

Objectives of seminar

1. Promote awareness of existing tools for targeting health communications at ‘ethnic’ groups

a. Individual patients on registers
b. Surgeries and other contact points
c. Local promotional activities

2. Discussion of ideas for collaborative use of individual patient registers to improve the quality of ‘ethnic’ coding

3. Forum for exchange of information

a. Methods
b. Ethics and other data protection issues

Our conception of ‘Ethnicity’

Multivariate classification based on a combination of:

Cultural Origin – e.g. religion, beliefs

Ethnicity – e.g. country of origin, diet

Language

Background to this seminar

Context : CASA (1)

• 1970s - Neighbourhood classifications used for prioritising public sector initiatives

• 1990s - Application of postcode classifications adopted by commercial companies

• 2002 – CASA becomes involved in the application of Mosaic in health, policing and education

• 2003 – CASA works with Dr Foster on Slough Diabetes project

Context : CASA (2)

• 2004 – CASA sets up Knowledge Transfer Partnership (KTP) with Camden PCT to develop health applications of geodemographics

• 2004 – CASA wins ESRC grant for ‘quantitative analysis of names’

• 2005 – Camden PCT develops capability in the application of ‘names’ as well as Mosaic to the targeting of public health campaigns

Contact details

• CASA website: www.casa.ucl.ac.uk/geonom

• E-mail addresses
  – Pablo Mateos – p.mateos@ucl.ac.uk
  – Richard Webber – richardwebber@blueyonder.co.uk
  – Paul Longley – p.longley@geog.ucl.ac.uk

‘Quantitative Analysis of Names’

• ESRC-funded project
  – Use of surname as an identifier of cultural origin
    • Regional origins of English names
    • Regional distribution of Celtic names
    • Current locations of names ‘imported from abroad’
      – Jewish
      – Continental European and Hispanic
      – Asian
      – African
      – Middle Eastern

Identification of potential applications

• Academic / Social Scientific
  – Study of meaning of names
  – Studies of historic migration patterns
  – Social mobility of Celtic migrants to England

• Policy applications
  – Measurement of ‘social capital’
  – Differentiation of crude ‘South Asian’ definition
  – Targeting of public sector communications programmes
  – Auditing of equal opportunities in employment

Key data files

• 40 million records: Experian 1996 GB electoral roll
  – First name
  – Surname
  – Postal area code
  – Mosaic code

• 26 million records: 1881 census

• Summary statistics on name frequencies by region from the Anglophone diaspora
  – US, Canada, Australia, New Zealand, Northern and Southern Ireland

Geography of the name ‘Webber’

% electors with occupational names

% electors of Welsh surnames

CEL assignment : Phase one

• Identify 25,000 surnames with > 100 occurrences in 1996

• Assign to hierarchy
  – English; general name type; detailed name type
  – Celtic; country of origin; general type
  – Imported from abroad; region of origin; country of origin

Webber

• Level one : English ‘metonym’

• Level two : Metonym ending in ‘-er’

• Level three : Manufacturing occupation

Zhang

• Level one : Imported from abroad

• Level two : East Asian

• Level three : Chinese
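
The three-level assignments shown for Webber and Zhang could be held as a simple lookup from surname to a tuple of levels. The following is a minimal sketch in Python, not the project's actual data model: the names `CelCode`, `CEL_DIRECTORY` and `lookup` are hypothetical, and only the two worked examples from the slides are included.

```python
from typing import NamedTuple, Optional


class CelCode(NamedTuple):
    """One three-level assignment (hypothetical structure)."""
    level_one: str
    level_two: str
    level_three: str


# Only the two examples given on the slides; the phase one directory
# described in the seminar covers roughly 25,000 surnames.
CEL_DIRECTORY = {
    "WEBBER": CelCode("English 'metonym'", "Metonym ending in '-er'",
                      "Manufacturing occupation"),
    "ZHANG": CelCode("Imported from abroad", "East Asian", "Chinese"),
}


def lookup(surname: str) -> Optional[CelCode]:
    """Return the code for a surname, or None if it is not in the directory."""
    return CEL_DIRECTORY.get(surname.strip().upper())


print(lookup("Zhang"))
```

Normalising to upper case before the lookup is an assumption made here so that register entries with inconsistent capitalisation still match.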

Names > 100 occurrences    Count
English                   19,246
Celtic                     3,396
Imported from abroad       2,987
Total                     25,630

Muslim and South Asian names (1462)

IMPORTED FROM ABROAD;MUSLIM;AFGHAN 19

IMPORTED FROM ABROAD;MUSLIM;BANGLADESHI 83

IMPORTED FROM ABROAD;MUSLIM;ERITREAN 3

IMPORTED FROM ABROAD;MUSLIM;LEBANESE 2

IMPORTED FROM ABROAD;MUSLIM;MIDDLE EASTERN 125

IMPORTED FROM ABROAD;MUSLIM;NORTH AFRICAN 33

IMPORTED FROM ABROAD;MUSLIM;PAKISTANI & IRANIAN 203

IMPORTED FROM ABROAD;MUSLIM;PERSONAL NAME 85

IMPORTED FROM ABROAD;MUSLIM;SOMALI 40

IMPORTED FROM ABROAD;MUSLIM;SUDANESE 1

IMPORTED FROM ABROAD;MUSLIM;TURKISH 88

IMPORTED FROM ABROAD;OTHER SOUTH ASIAN;HINDI 254

IMPORTED FROM ABROAD;OTHER SOUTH ASIAN;NEPALESE 2

IMPORTED FROM ABROAD;OTHER SOUTH ASIAN;NORTH INDIAN 184

IMPORTED FROM ABROAD;OTHER SOUTH ASIAN;SIKH 290

IMPORTED FROM ABROAD;OTHER SOUTH ASIAN;SOUTH INDIAN & SRI LANKAN 50
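
Each row above is a semicolon-delimited three-level code followed by a surname count, so the list can be parsed and rolled up by its second level. A minimal sketch in Python, using only the figures on the slide; the names `ROWS` and `parse_row` are hypothetical.

```python
from collections import Counter

# Rows copied from the slide: "level one;level two;level three count".
ROWS = """\
IMPORTED FROM ABROAD;MUSLIM;AFGHAN 19
IMPORTED FROM ABROAD;MUSLIM;BANGLADESHI 83
IMPORTED FROM ABROAD;MUSLIM;ERITREAN 3
IMPORTED FROM ABROAD;MUSLIM;LEBANESE 2
IMPORTED FROM ABROAD;MUSLIM;MIDDLE EASTERN 125
IMPORTED FROM ABROAD;MUSLIM;NORTH AFRICAN 33
IMPORTED FROM ABROAD;MUSLIM;PAKISTANI & IRANIAN 203
IMPORTED FROM ABROAD;MUSLIM;PERSONAL NAME 85
IMPORTED FROM ABROAD;MUSLIM;SOMALI 40
IMPORTED FROM ABROAD;MUSLIM;SUDANESE 1
IMPORTED FROM ABROAD;MUSLIM;TURKISH 88
IMPORTED FROM ABROAD;OTHER SOUTH ASIAN;HINDI 254
IMPORTED FROM ABROAD;OTHER SOUTH ASIAN;NEPALESE 2
IMPORTED FROM ABROAD;OTHER SOUTH ASIAN;NORTH INDIAN 184
IMPORTED FROM ABROAD;OTHER SOUTH ASIAN;SIKH 290
IMPORTED FROM ABROAD;OTHER SOUTH ASIAN;SOUTH INDIAN & SRI LANKAN 50
""".splitlines()


def parse_row(row):
    """Split one row into its three hierarchy levels and its surname count."""
    code, count = row.rsplit(" ", 1)  # the count is the last space-separated token
    level_one, level_two, level_three = code.split(";")
    return level_one, level_two, level_three, int(count)


# Sum the surname counts within each second-level group.
by_level_two = Counter()
for row in ROWS:
    _, level_two, _, count = parse_row(row)
    by_level_two[level_two] += count

print(dict(by_level_two))          # {'MUSLIM': 682, 'OTHER SOUTH ASIAN': 780}
print(sum(by_level_two.values()))  # 1462
```

The grand total agrees with the 1462 given in the slide heading.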

Phase one assignment method (25,000 names with > 100 occurrences)

1. General knowledge

2. Identification of top postal area and level of concentration in it

3. Identification of top Mosaic type and level of concentration in it

4. Identification of concentration in 1881

5. Frequencies in other Anglophone countries
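
Steps 2 and 3 both amount to finding the category in which a surname occurs most often and measuring how concentrated the name is there. A minimal sketch in Python, assuming "level of concentration" means the share of a surname's electors found in its top category; the slides do not define the measure, and the function name and sample data are invented for illustration.

```python
from collections import Counter


def top_category_concentration(categories):
    """Return (top_category, share) for a list with one category per elector.

    `categories` can hold postal areas (step 2) or Mosaic types (step 3).
    """
    counts = Counter(categories)
    top_category, top_count = counts.most_common(1)[0]
    return top_category, top_count / len(categories)


# Hypothetical postal areas for six electors sharing one surname.
sample_areas = ["BD", "BD", "BD", "LU", "B", "BD"]
area, share = top_category_concentration(sample_areas)
print(area, round(share, 2))  # BD 0.67
```

A name with most of its electors in a single postal area or Mosaic type gives a much stronger clue to its origin than one spread evenly across the country.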

C20 : Suburban Comfort / Asian Enterprise

Wakemans Hill, Colindale, NW9 0UU

The Warren, Heston, TW5 0JW

Headcorn Road, Thornton Heath, CR7 6JS

Himley Crescent, Wolverhampton, WV4 5DA

D26 : Ties of Community / South Asian Industry

Aberdeen Place, Bradford, BD7 2HG

Ivy Road, Luton, LU1 1DL

Edmundson Road, Blackburn, BB2 1HL

Osborn Road, Sparkbrook, Birmingham, B11 1TT

Status and Asian names

Surname    Asian Enterprise    South Asian Industry
Mayat                   145                    9533
Lorgat                  971                    8840
Gorasia                7622                     275

D27 : Ties of Community / Settled Minorities

Algernon Road, Lewisham, SE13 7AP

Essex Road, Leyton, E10 6BT

Guildersfield Road, Streatham, SW16 5LS

Melbourne Road, Walthamstow, E17 6LR

F36 : Welfare Borderline / Metro Multiculture

Broadwater Farm, Tottenham, N17 6HT

Hillcrest, Highgate, N6 4EX

Kenninghall Road, Lower Clapton, E5 8DG

Samuel Street, Woolwich, SE18 5LJ

Output

• Directory assigning a Cultural / Language / Ethnicity code to each name with more than 100 occurrences on the GB electoral roll

Phase two assignment (all surnames > 5 occurrences)

• Rank first names by frequency

• Allocate names to CEL categories where possible

• Identify for each surname the proportion of associated first names in known CEL categories
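
The third step can be sketched as a tally, per surname, of the first names that already carry a known category. This is a minimal illustration in Python rather than the project's method: the function name, the `GREEK` label attached to the two first names, and the sample surname "Pappas" are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def surname_cel_profile(electors, first_name_cel):
    """For each surname, return the share of its electors whose first name
    falls in each known category. Electors are (first_name, surname) pairs."""
    tallies = defaultdict(Counter)
    totals = Counter()
    for first_name, surname in electors:
        totals[surname] += 1
        cel = first_name_cel.get(first_name)
        if cel is not None:  # first names with no known category are skipped
            tallies[surname][cel] += 1
    return {
        surname: {cel: n / totals[surname] for cel, n in tallies[surname].items()}
        for surname in totals
    }


# Hypothetical inputs for illustration only.
first_name_cel = {"Sotiris": "GREEK", "Antigoni": "GREEK"}
electors = [
    ("Sotiris", "Pappas"),
    ("Antigoni", "Pappas"),
    ("John", "Pappas"),
    ("John", "Smith"),
]
profile = surname_cel_profile(electors, first_name_cel)
print(profile)
```

Here two of the three electors named Pappas have first names in a known category, so the surname would lean towards that category, while Smith returns an empty profile.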

Selected first names     Agapios  Antigoni  Sotiri  Sotiris
Total occurrences             10        57      11      224
Surnames not British           6        34       7      143
Surnames Greek                 3        19       5       75
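
Reading the table above as counts of electors per first name, the share of non-British surnames that are Greek can be worked out directly. A minimal sketch in Python using only the figures on the slide; the interpretation of the rows as nested counts and the function name are assumptions.

```python
# Counts transcribed from the slide:
# (total occurrences, surnames not British, surnames Greek).
FIRST_NAMES = {
    "Agapios": (10, 6, 3),
    "Antigoni": (57, 34, 19),
    "Sotiri": (11, 7, 5),
    "Sotiris": (224, 143, 75),
}


def greek_share_of_non_british(name):
    """Share of the non-British surnames that are Greek, for one first name."""
    _total, non_british, greek = FIRST_NAMES[name]
    return greek / non_british


for name in FIRST_NAMES:
    print(f"{name}: {greek_share_of_non_british(name):.1%}")
# Agapios: 50.0%
# Antigoni: 55.9%
# Sotiri: 71.4%
# Sotiris: 52.4%
```

For each of the four first names, at least half of the non-British surnames they occur with are Greek, which is the kind of signal phase two uses to allocate a first name to a category.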

Output

• Database giving for 60,000 surnames ‘imported from abroad’
  – % electors by CEL of first name
  – Most common cell (three level hierarchy)

• Database giving for 60,000 first names ‘imported from abroad’ (3.2m occurrences)
  – % electors by CEL of surname
  – Most common cell (three level hierarchy)

Evaluation of solution

• Seems to work well for all ethnic groups other than Caribbeans

• CEL overlaps between surname and first name
  – South Asians and Muslims – 80%
  – Africans, Turks, Cypriots, Chinese – 50%
  – Hispanics – 20%
  – Other Europeans – 8 to 15%
  – Jewish – 4%
  – Irish, Scots, Welsh – 3%

• High overlap between certain CELs
  – Within Muslim group
  – Spain, Portugal, Italy
  – Netherlands, Germany and Czech Republic

• Confusion among serial migrant groups
  – Hispanic migrants to India
  – Chinese migrants to West Indies
