Comparing network and association models in the analysis of historical
patterns of occupational interactions and stratification
Paul Lambert (1), David Griffiths (1), Richard Zijdeman (2), Ineke Maas (2), Marco van Leeuwen (2)
Paper presented to the European Social Science History Conference, 11-14 April 2012, University of Glasgow, UK
(1) University of Stirling, UK, contact email: [email protected]
(2) University of Utrecht, Netherlands
Motivation
• Studying social interactions and social connections can help us to understand social trends and transformations
• Social mobility; homogamy; industrialisation; etc
• Taking full advantage of historical occupational codes, new data, and new analytical opportunities
• HISCO/NAPPHISCO/Microclass standardised codes…
• …capture fine-grained details, but potentially aggregate some occupations by sector rather than level
– GB 1831 census “..occupational returns as ‘crude, undigested, and essentially unscientific’, a document ‘whose insufficiency is a national disgrace to us, for there the trading and working classes are all jumbled together in the most perplexing confusion, and the occupations classified in a manner that would shame the merest tyro’” [Thompson 1963: 25, citing Mayhew 1862]
What’s new?
1) Data resources
• Census returns with household sharers’ occupations as proxy for social distance
[Figure: number of jobs within household, showing % of cases and % of jobs female. Panels: USA, 1900, N=282158; Iceland, 1801, N=9071.]
2) Occupational coding
• Originally in NAPP/PUMS codes (NAPPHISCO, or national unit)
• (Approximate) recode into HISCO: R Zijdeman; www.geode.stir.ac.uk
• (Approximate) recode into ‘Microclass’: D Griffiths; www.geode.stir.ac.uk
• ‘Microclass’ (Weeden and Grusky 2005; Jonsson et al. 2009) – socially defined fine-grained occupational clusters
Data sources

Country   Years                          N cases (k)             Source   Occupations   Derived occs
Canada    1871, 1881, 1891, 1901         8; 1276; 156; 92        NAPP     NAPPHISCO     HISCO; Microclass
Iceland   1801, 1901                     9; 34                   NAPP     NAPPHISCO     HISCO; Microclass
Sweden    1900                           1573                    NAPP     NAPPHISCO     HISCO; Microclass
Britain   1851; 1881s; 1881ew            214; 2096; 13500        NAPP     OCCGB         Microclass
USA       1850, 1860, 1870, 1880, 1900   53; 83; 121; 170; 282   PUMS     US1880        HISCO; Microclass
Norway    1801, 1865, 1875, 1900         228; 633; 286; 1037     NAPP     NAPPHISCO     HISCO; Microclass

N refers to number of adults in dataset with valid occupational records. The number of unique within household connections between these adults is usually between 1 and 2 times the number of adults.
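A minimal sketch of how unique within-household connections could be enumerated from person-level records. The record layout, the function name `household_pairs` and the example data are assumptions made for illustration, not the procedure used for the NAPP extracts.

```python
from collections import defaultdict
from itertools import combinations


def household_pairs(records):
    """Return one (household, occupation_a, occupation_b) tuple per unique
    pair of adults who share a household and both have a valid occupation."""
    by_household = defaultdict(list)
    for household_id, occupation in records:
        if occupation is not None:  # drop adults without a valid occupation
            by_household[household_id].append(occupation)
    pairs = []
    for household_id, occupations in by_household.items():
        # combinations() yields each unordered pair of household members once
        for occ_a, occ_b in combinations(occupations, 2):
            pairs.append((household_id, occ_a, occ_b))
    return pairs


# Made-up records: (household id, microclass code of one adult)
records = [
    ("h1", 5201), ("h1", 5202), ("h1", 4310),
    ("h2", 1304), ("h2", 1310),
    ("h3", 4203),
    ("h4", 5201), ("h4", None),
]
pairs = household_pairs(records)
adults = sum(1 for _, occ in records if occ is not None)
print(len(pairs), "connections among", adults, "adults")
```

A household with k adults holding valid occupations contributes k(k-1)/2 connections, so the ratio of connections to adults depends on the household size distribution.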
1101. Jurists
1102. Health professionals
1103. Professors and instructors
1104. Natural scientists
1105. Statistical and social scientists
1106. Architects
1107. Accountants
1108. Journalists, authors, and related writers
1109. Engineers
1201. Officials, government and non-profit organizations
1202. Managers
1203. Commercial Managers
1204. Building managers and proprietors
1304. Elementary and secondary school teachers
1305. Librarians
1306. Creative artists
1307. Ship officers
1308. Professional, technical, and related workers, n.e.c.
1309. Social and welfare workers
1310. Workers in religion
1311. Nonmedical technicians
1312. Health semiprofessionals
2001. Proprietors
3101. Real estate agents
3102. Other agents
3105. Sales workers and shop assistants
3201. Telephone operators
3202. Bookkeepers and related workers
3203. Office and clerical workers
3204. Postal and mail distribution clerks
4101. Craftsmen and kindred workers, n.e.c.
4102. Foremen
4104. Printers and related workers
4105. Locomotive operators
4106. Electricians
4107. Tailors and related workers
4109. Blacksmiths and machinists
4110. Jewelers, opticians, and precious metal workers
4111. Other mechanics
4112. Plumbers and pipe-fitters
4113. Cabinetmakers
4114. Bakers
4115. Welders and related metal workers
4116. Painters
4117. Butchers
4118. Stationary engine operators
4119. Bricklayers, carpenters & related
4120. Heavy machine operators
4202. Chemical processors
4203. Miners and related workers
4204. Longshoremen and freight handlers
4205. Food processors
4206. Textile workers
4207. Sawyers and lumber inspectors
4208. Metal processors
4209. Operatives and kindred workers, n.e.c.
4210. Forestry workers
4301. Protective service workers
4302. Transport conductors
4304. Food service workers
4305. Mass transportation operators
4306. Service workers, n.e.c.
4307. Hairdressers
4309. Launderers and dry-cleaners
4310. Housekeeping workers
4311. Janitors and cleaners
4312. Gardeners
5101. Fishermen
5201. Farmers and farm managers
5202. Farm laborers
9990. Members of armed forces
[Figure: Canada 1891, males and females by microclass units (axis from 0 to 8000; separate series for men and women). 5201/5201 downweighted by factor of 5.]
Preliminary versions – are contemporary microclasses a convenient way to measure and analyse fine-grained historical detail?
Sample         CAM/USC   Microclass   HISCO   NAPPHISCO (OCCGB)

Model: R2 in predicting alter’s HISCAM
CA 1871        0.155     0.247        0.270   0.303
CA 1881        0.194     0.279        0.309   0.310
CA 1891        0.299     0.404        0.433   0.437
CA 1901        0.143     0.252        0.280   0.283
IC 1801        0.060     0.137        0.166   0.167
IC 1901        0.009     0.032        0.043   0.043
SE 1900        0.000     0.167        0.192   0.192

Model: R2 in predicting alter’s CAMSIS
GB 1851        0.300     0.319        n/a     0.344
GB 1881 (EW)   0.236     0.258        n/a     0.282
GB 1881 (S)    0.189     0.228        n/a     0.245

Model: R2 in predicting alter’s literacy (plus father’s HISCAM if literacy missing)
US 1850        0.027     0.053        0.057   0.058
US 1860        0.026     0.059        0.065   0.066
US 1870        0.067     0.145        0.151   0.151
US 1880        0.040     0.099        0.103   0.104
US 1900        0.032     0.069        0.075   0.076

Model: R2 in predicting alter’s HISCAM
NO 1801        0.067     0.115        0.156   0.157
NO 1865        0.028     0.064        0.081   0.081
NO 1875        0.057     0.099        0.116   0.117
NO 1900        0.084     0.162        0.180   0.181
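A minimal sketch of why R2 rises as the occupational coding becomes finer. It assumes the R2 comes from a regression of alter's score on one indicator per occupational category, which equals the between-category share of the variance; the slides do not state the exact model, and the data and the first-digit aggregation below are made up.

```python
from collections import defaultdict


def r2_from_categories(categories, outcome):
    """R2 of predicting `outcome` from a categorical variable
    (between-category sum of squares divided by total sum of squares)."""
    n = len(outcome)
    grand_mean = sum(outcome) / n
    total_ss = sum((y - grand_mean) ** 2 for y in outcome)
    groups = defaultdict(list)
    for category, y in zip(categories, outcome):
        groups[category].append(y)
    between_ss = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return between_ss / total_ss


# Made-up data: ego's microclass code and alter's scale score
ego_fine = [1101, 1101, 1102, 1102, 5201, 5201, 5202, 5202]
alter_score = [90, 80, 70, 72, 50, 52, 40, 30]
# Hypothetical coarse coding: the first digit of the microclass code
ego_coarse = [code // 1000 for code in ego_fine]

r2_coarse = r2_from_categories(ego_coarse, alter_score)
r2_fine = r2_from_categories(ego_fine, alter_score)
print(round(r2_coarse, 3), round(r2_fine, 3))
```

When the fine coding nests the coarse one, the fine R2 cannot be lower, which is the pattern across the columns of the table.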
What’s new?
3) Methods for analysing {within-household} social connections on large-scale and fine-grained data
…Focus on the individual outcome..
Model with occupation-based indicators (plus random or fixed effects)
…Focus on the social connection..
Association models
• HISCAM (Lambert et al. 2012)
• Chan (2010) on ‘status’ scales
Characterise dimensions to the occupational interaction structure
Network analysis
• ‘SONOCS’ (Griffiths & Lambert 2011)
• Cf. Wellman & Berkowitz (1988)
Identify particular ‘routes’ of occupational connections
[Figure: Iceland, 1901, microclasses (excluding diagonals). Both axes list microclass codes from 1101 to 9999.]
[Figure: Iceland, 1901, microclasses (excluding diagonals and pseudo-diagonals). Both axes list microclass codes from 1101 to 9999.]
[Figure: Norway, 1865, microclasses (excluding diagonals, n*20). Both axes list microclass codes from 1101 to 9999.]
[Figure: Norway, 1865, m-f, microclasses (excluding diagonals, n*20). Both axes list microclass codes from 1101 to 9999.]
[Figure: USA, 1850, m-m, HISCO units (n*4). Both axes list HISCO codes from 2000 to 99900.]
What can we do with such data?
a) Statistical models of occupation-based outcomes
b) Statistical models of the association process
c) Network depictions of prevalence of connections
Intergenerational HISCAM (all m-m) R
Canada 1871=0.57; 1881=0.47; 1891=0.46; 1901=0.43
Iceland 1801=0.41, 1901=0.07
Sweden 1900=0.37
Britain 1851=0.21; 1881ew=0.36; 1881s=0.30
USA 1850=0.30; 1860=0.33; 1870=0.33; 1880=0.31; 1900=0.33
Norway 1801=0.23; 1865=0.23; 1875=0.29; 1900=0.27
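A minimal sketch of the correlation summarised above, assuming R is the Pearson correlation between the father's and the son's scale score across father-son pairs. The scores below are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up HISCAM-like scores for six father-son pairs
father = [45.0, 50.0, 55.0, 60.0, 70.0, 85.0]
son = [50.0, 48.0, 60.0, 55.0, 65.0, 80.0]
r = pearson_r(father, son)
print(round(r, 2))
```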
[Figure: Father-Son Social Mobility [HISCAM, microclass, N=45k], Canada 1891. Son's score plotted against father's score (axes from 40 to 100), showing values, perfect fit and regression fit.]
(a) Model individual outcomes: Linear/random/fixed effects

Models:
(1) OLS
(2) (1) + father’s HISCAM
(3) (2) + f.e. HISCO
(4) (2) + f.e. microclass
(5) (2) + r.e. HISCO
(6) (2) + r.e. microclass

               (1)      (2)      (3)      (4)      (5)      (6)
Age (linear)   29.5     32.1     35.7     34.6     35.7     34.5
Female         -120.9   -127.2   -128.6   -130.1   -128.8   -130.1
Jewish         7.9      7.5      7.1      7.0      7.1      7.0
Sami           1.6      1.8      2.2      2.1      2.2      2.1
Finnish        -2.0     -1.7     -1.7     -1.9     -1.7     -1.9
Urban          36.6     32.3     18.7     19.6     19.0     19.8
Cohabits       -19.6    -18.5    -16.5    -17.0    -16.5    -17.0

Father’s HISCAM: 37.5 5.4 3.6 6.5
Rho: 0.197 0.038 0.086 0.026
r2: 0.109 0.119

Data: Sweden 1900, N=124238, Child HISCAM predicted by father’s HISCAM. T-statistics.
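A minimal sketch of what adding occupational fixed effects does to a slope estimate. Demeaning both variables within each occupational unit removes between-unit differences before the slope is estimated. The data, the unit labels and the helper names are made up, and the slides do not say which software or specification was used.

```python
from collections import defaultdict


def slope(x, y):
    """Bivariate OLS slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_within(values, groups):
    """Subtract each group's mean from its members (within transformation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Made-up data: father's score, child's score, hypothetical occupational unit
father = [40.0, 50.0, 70.0, 80.0]
child = [42.0, 47.0, 82.0, 87.0]
unit = ["A", "A", "B", "B"]

pooled = slope(father, child)  # ignores the units
within = slope(demean_within(father, unit), demean_within(child, unit))
print(pooled, within)
```

In this toy example most of the pooled association lies between the two units, so the within-unit slope is much smaller than the pooled one.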
(b) Association models
‘Cambridge Social Interaction and Stratification Scales’
See www.camsis.stir.ac.uk/hiscam & Lambert et al. (2012) for historical data examples
• Social Interaction Distance (‘SID’) analysis
• RC(II) model / Correspondence analysis
• First dimension of association can usually be labelled as ‘stratification’
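A minimal sketch of the correspondence analysis route to a first dimension of association, under stated assumptions: a small made-up occupation-by-occupation table, and power iteration in place of a full singular value decomposition. It illustrates the general technique and is not the HISCAM estimation procedure.

```python
import math


def first_dimension(table, iterations=200):
    """Return (row scores on dimension 1, share of inertia in dimension 1).
    Assumes non-zero row and column totals and some association in the table."""
    total = float(sum(sum(row) for row in table))
    n_rows, n_cols = len(table), len(table[0])
    r = [sum(row) / total for row in table]                      # row masses
    c = [sum(table[i][j] for i in range(n_rows)) / total
         for j in range(n_cols)]                                 # column masses
    # Standardised residuals from the independence model
    S = [[(table[i][j] / total - r[i] * c[j]) / math.sqrt(r[i] * c[j])
          for j in range(n_cols)] for i in range(n_rows)]
    total_inertia = sum(s * s for row in S for s in row)
    # Power iteration on S'S; an uneven start vector avoids a symmetric trap
    v = [float(j + 1) for j in range(n_cols)]
    for _ in range(iterations):
        u = [sum(S[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        v = [sum(S[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = [sum(S[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = math.sqrt(sum(x * x for x in u))                     # singular value
    row_scores = [u[i] / sigma / math.sqrt(r[i]) for i in range(n_rows)]
    return row_scores, sigma ** 2 / total_inertia


# Made-up table: rows and columns are three occupational groups
table = [
    [50, 20, 5],
    [20, 40, 20],
    [5, 20, 50],
]
scores, share = first_dimension(table)
print([round(s, 2) for s in scores], round(share, 3))
```

The sign of the scores is arbitrary, so a scale built this way has to be oriented by hand, for example so that professional occupations score high.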
How to use SID analysis effectively..?
• Carefully prepared specific analysis…
• ..or semi-automated comparisons?
• Fine- v’s coarse-grained analysis?
Scale scores can indicate change in occupations through context
Model fit statistics allow study of trends/structures

Fully automated, m-f homogamy, %inertia in dims 1+2
Canada    1871=0.90; 1881=0.63; 1891=0.51; 1901=0.47
Iceland   1801=0.94; 1901=0.73
Sweden    1900=0.56
Britain   1851=0.48; 1881ew=0.56; 1881s=0.53
USA       1850=. ; 1860=0.55; 1870=0.67; 1880=0.53; 1900=0.50
Norway    1801=0.87; 1865=0.78; 1875=0.58; 1900=0.64

Fully automated, father-son, correlation to contemporary CAMSIS
Canada    1871=0.38; 1881=0.44; 1891=0.56; 1901=0.64
Iceland   1801=0.76; 1901=0.22
Sweden    1900=0.11
Britain   1851=0.66; 1881ew=0.66; 1881s=0.10
USA       1850=0.01; 1860=0.16; 1870=0.03; 1880=0.12; 1900=0.62
Norway    1801=0.68; 1865=0.49; 1875=0.65; 1900=0.20
[Figure: Automated scales, selected scores only, Norway, 1900 and 1801. Male score in 1900 and male score in 1801 (axis from 0 to 100) for microclasses 5201, 5101, 4210, 4203, 4113, 4101, 9999, 4107, 4109, 4309, 4204, 4115, 4305, 4205, 4110, 4102, 1301, 1309, 1308, 1307, 1305, 1109, 1107.]
• Main contribution of association models is to tell us about the average social positions of the incumbents of occupations (and how they change across societies)
c) Network analysis
Still looking at number of connections {within household} but change in emphasis on features of connections

                                       Canada    Norway   Scotland   USA
Cases                                  123,749   54,067   261,187    22,349
Links                                  101       136      111        208
Microclasses (older cohort)            45        50       41         45
Microclasses (younger cohort)          35        38       39         41
Strongest bond (* times expectation)   239       146      19         55
Network: Degree centrality             .10       .14      .12        .18
Network: Closeness centrality          .23       .23      .27        .26
Network: Components                    2         1        2          1
Network: Distance                      10        12       7          5
Network: average distance              3.8       3.7      3.2        2.6

Note, for Canada and Scotland closeness centrality refers to largest component only.
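A minimal sketch of how some network-level summaries of this kind (components, longest distance, average distance) could be computed from a list of ties using breadth-first search. The ties are invented, and how the reported degree and closeness centrality figures were computed is not stated in the slides, so they are left out.

```python
from collections import deque


def build_graph(edges):
    """Undirected adjacency sets from a list of (node, node) ties."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def distances_from(graph, start):
    """Shortest path length from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def summarise(graph):
    seen = set()
    components = 0
    path_lengths = []  # each connected pair is counted from both ends
    for node in graph:
        dist = distances_from(graph, node)
        if node not in seen:  # first visit to this component
            components += 1
            seen.update(dist)
        path_lengths.extend(d for other, d in dist.items() if other != node)
    return {
        "components": components,
        "longest_distance": max(path_lengths),
        "average_distance": sum(path_lengths) / len(path_lengths),
    }


# Made-up ties between microclass codes
edges = [(1101, 1102), (1102, 1304), (1304, 1310), (1101, 1310),
         (5201, 5202), (5202, 4312)]
summary = summarise(build_graph(edges))
print(summary)
```

Distances are only defined within a component, so both distance measures here are taken over connected pairs.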
[Figure: network diagrams for Canada 1881, USA 1880, Scotland 1881 and Norway 1876.]
Microclasses with ties *2 expected + non-sparse; male-male links if >16yrs age gap
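A minimal sketch of one way to select ties that occur at twice the expected rate. It assumes that "expected" means the count implied by independence of the two occupations, and it uses a minimum count as a stand-in for the "non-sparse" condition. The threshold `min_count` and the data are hypothetical.

```python
from collections import Counter


def over_represented_ties(pairs, ratio=2.0, min_count=5):
    """Return {(occ_a, occ_b): observed / expected} for pairs observed at
    least `min_count` times and at least `ratio` times more often than
    expected if the two occupations were independent."""
    n = len(pairs)
    observed = Counter(pairs)
    first = Counter(a for a, _ in pairs)    # margin for the first member
    second = Counter(b for _, b in pairs)   # margin for the second member
    ties = {}
    for (a, b), count in observed.items():
        expected = first[a] * second[b] / n
        if count >= min_count and count / expected >= ratio:
            ties[(a, b)] = count / expected
    return ties


# Made-up (older cohort, younger cohort) microclass pairs
pairs = ([(1101, 1102)] * 6 + [(5201, 5202)] * 30
         + [(5201, 1101)] * 2 + [(1102, 5202)] * 2)
ties = over_represented_ties(pairs)
print(ties)
```

The farming pair is by far the most frequent in this example but is dropped, because its count is close to what the large margins alone would produce.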
Scotland 1881
Librarians (1305) and creative artists (1306) with links to printers (4104) and craftsmen
Housekeepers (4310)
Farming community (5201, 5202), forestry workers (4210) and gardeners (4312)
Managers (1202) and ships’ officers (1307) link to their subordinates (4306)
Clerks (3203) and agents (3102) interact with various professionals
Lawyers (1101), medics (1102), teachers (1304) and the clergy (1310) form a clique at centre of the network
Canada 1881
Ties not as obvious; sparse connections within mesoclasses, but stratification effects most observable
Farmers (5201) and farm labourers (5202) do not have mutual ties to forestry workers
Teachers (1304), clergy (1310), lawyers (1101) and medics (1102) have sparse ties
Clerical and sales workers (3***) strongly interact, but few ties to professionals (1***)
Librarians (1305) and creative artists (1306) don’t form any strong ties and aren’t represented
Food service workers (4304) are the ‘sons’ of many other routine workers
Housekeepers (4310)
Canada 1881 (left) with microclasses split by religion (red=catholic; white=non-catholic).
Clear division on religious grounds in 1881.
Canada 1891 (right) with microclasses split by religion (red=catholic; white=non-catholic).
Religious divide continues, but households that cross religion and microclass are much more common.
Canada (by religion)                                        1881          1891
Cases                                                       92,048        22,084
% Roman Catholic                                            33.1%         28.6%
% Catholics with Catholic alter                             84.1%         60.6%
% non-Catholics with Catholic alter                         8.2%          17.4%
Mean HISCAM, all cases (standard deviation)                 58.0 (10.9)   57.7 (11.4)
Mean difference in HISCAM, all cases (standard deviation)   9.2 (11.5)    10.1 (11.6)
% HISCAM difference <1/2 s.d.
… (Catholic to Catholic)                                    52.0%         51.7%
… (non-Catholic to non-Catholic)                            51.5%         49.3%
… (Catholic to non-Catholic)                                45.5%         44.4%
% HISCAM difference >2 s.d.
… (Catholic to Catholic)                                    11.4%         16.6%
… (non-Catholic to non-Catholic)                            12.8%         11.9%
… (Catholic to non-Catholic)                                12.4%         11.8%
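A minimal sketch of the two distance shares in the table, assuming the difference is the absolute gap between ego's and alter's HISCAM score and that "s.d." is the standard deviation of HISCAM over all cases (10.9 in 1881). The pairs below are invented.

```python
def difference_shares(pairs, sd):
    """Share of pairs whose absolute score gap is under half a standard
    deviation, and share whose gap is over two standard deviations."""
    gaps = [abs(ego - alter) for ego, alter in pairs]
    close = sum(1 for g in gaps if g < 0.5 * sd) / len(gaps)
    far = sum(1 for g in gaps if g > 2 * sd) / len(gaps)
    return close, far


# Made-up (ego HISCAM, alter HISCAM) pairs
pairs = [(50, 52), (60, 61), (40, 70), (55, 58), (45, 80)]
close_share, far_share = difference_shares(pairs, sd=10.9)
print(close_share, far_share)
```

Running the same function separately on Catholic to Catholic, non-Catholic to non-Catholic and mixed pairs would give the three rows under each heading.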
Summary: Social connections between occupations
• Connections are central to social organisation of the stratification system [e.g. Bottero 2005]
• Problems of data preparation and scale
• Occupational coding – NAPP; HISCO; Microclass
• Identify social connections (within hhld NAPP)
• Select/discard some types of connections (e.g. farming)
• Analytical approaches
Model with proxy indicators, random or fixed effects
…Focus on the social connection..
Association models
Network analysis
References cited
• Bottero, W. (2005). Stratification: Social Division and Inequality. London: Routledge.
• Griffiths, D., & Lambert, P. S. (2011). Dimensions and Boundaries: Comparative analysis of occupational structures using social network and social interaction distance analysis. Paper presented at the ISA RC28 Spring meeting, University of Essex, 13-16 April 2011.
• Jonsson, J. O., Grusky, D. B., Di Carlo, M., Pollak, R., & Brinton, M. C. (2009). Microclass Mobility: Social Reproduction in Four Countries. American Journal of Sociology, 114(4), 977-1036.
• Lambert, P. S., Zijdeman, R. L., Maas, I., van Leeuwen, M. H. D., & Prandy, K. (2012). The construction of HISCAM: A stratification scale based on social interactions for historical research. Historical Methods, forthcoming.
• Mayhew, H. (1862). London Labour and the London Poor.
• Thompson, E. P. (1980[1963]). The Making of the English Working Class. London: Penguin.
• Weeden, K. A., & Grusky, D. B. (2005). The Case for a New Class Map. American Journal of Sociology, 111(1), 141-212.
Data from:
• Minnesota Population Center. (2011). Integrated Public Use Microdata Series, International: Version 6.1 [Machine readable database]. Minneapolis: University of Minnesota, and https://international.ipums.org/ (accessed 1 July 2011).
• North Atlantic Population Project and Minnesota Population Center. (2008). NAPP: Complete Count Microdata. NAPP Version 2.0 [computer files]. Minneapolis, MN: Minnesota Population Center [distributor] [http://www.nappdata.org]