40
Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past 2014년 한국물리학회 가을학술논문발표회 통계물리학분과회 [FG-09] SHL, R. Ffrancon, D. M. Abrams, B. J. Kim (김범준), and M. A. Porter, Phys. Rev. X 4, 041009 (2014). Sang Hoon Lee (이상훈) Department of Energy Science (에너지과학과) Sungkyunkwan University (성균관대학교) [email protected] http://sites.google.com/site/lshlj82

Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Embed Size (px)

DESCRIPTION

Matchmaker, Matchmaker, Make Me a Match: 
Migration of Populations via Marriages in the Past @ KPS Fall Meeting 2014

Citation preview

Page 1: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

2014년 한국물리학회 가을학술논문발표회 통계물리학분과회 [FG-09]

SHL, R. Ffrancon, D. M. Abrams, B. J. Kim (김범준), and M. A. Porter, Phys. Rev. X 4, 041009 (2014).

Sang Hoon Lee (이상훈) Department of Energy Science (에너지과학과)

Sungkyunkwan University (성균관대학교) [email protected]

http://sites.google.com/site/lshlj82

Page 2: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

from https://twitter.com/AcademicsSay/status/524571824492150784

Page 3: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

from https://twitter.com/AcademicsSay/status/524571824492150784KPS meeting author list

Page 4: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Kim from Gyeongju (경주 김) ≠ Kim from Gimhae (김해 김)

Korean family names are subdivided into 본관 named after their geographical origins

Page 5: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Kim from Gyeongju (경주 김) ≠ Kim from Gimhae (김해 김)

Korean family names are subdivided into 본관 named after their geographical origins

Page 6: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Tales of two Korean researchers’ origins

Sang Hoon Lee (이상훈)Beom Jun Kim (김범준)Kim from Gimhae (김해 김)

Lee from Hakseong (학성 이)

Page 7: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Tales of two Korean researchers’ origins

Sang Hoon Lee (이상훈)Beom Jun Kim (김범준)Kim from Gimhae (김해 김)

Lee from Hakseong (학성 이)

Page 8: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Tales of two Korean researchers’ origins

Sang Hoon Lee (이상훈)Beom Jun Kim (김범준)Kim from Gimhae (김해 김)

Lee from Hakseong (학성 이)

Page 9: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

family/origin population density as a percentage of total population density

Really?

from the census data (in 1985 and 2000—the only two years when the individual 본관’s regional data is available) showing the list of populations of 본관 for each administrative regions of South Korea, provided by Korean Statistical Information Service (KOSIS): http://kosis.kr/

Page 10: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

family/origin population density as a percentage of total population density

Really?

from the census data (in 1985 and 2000—the only two years when the individual 본관’s regional data is available) showing the list of populations of 본관 for each administrative regions of South Korea, provided by Korean Statistical Information Service (KOSIS): http://kosis.kr/

“ergodic” vs “non-ergodic” 본관 ...

Page 11: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

족보 in Korean culturefamily tree of (mostly) male descendants

spouses’ 본관 and birth year: “by-product” data sets

Page 12: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

족보 in Korean culturefamily tree of (mostly) male descendants

spouses’ 본관 and birth year: “by-product” data sets

10

10-9

10-6

10-3

100

100 102 104 106 108

P(k

)

k

1900-19901600-1630

Census

Figure 6. Comparison between the actual family books (points) and RGFpredictions (lines).

1

1.1

1.2

1.3

102 104 106 108

γ

M

Figure 7. Power-law exponent as a function of M . The solid line was obtainedby analyzing the family books from 1900–1990, and the dotted line is connectedwith the census data in 2000.

Mtot = 165 020 women each having one family name out of Ntot = 194 and where kmax = 32 316are named Kim. These three numbers Mtot, Ntot and kmax uniquely determine PM(k) within theRGF model, as explained in section 3 [6]. The middle full curve in figure 6 gives the predictedsize distribution and the pluses denote the actual (binned) data points. The agreement betweenthe RGF prediction and the data is very good, in particular in view of the fact that the predictionis based solely on the three numbers Mtot, Ntot and kmax. The prediction for the exponent � in(1) is � = 1.12. As explained in section 3, the RGF model allows you to predict how the PM(k)for either a smaller m < M or larger m > M . Figure 7 displays the predicted change of � whenstarting from the data given for the period 1900–1990. The solid curve gives the change whenm is decreasing and the dotted line when it is increasing. The fact that � changes with the sizeof the data set is a fundamental feature of the RGF model and distinguishes it from the usualgrowth models, which in general give scale-invariant and hence size-independent � [6]. The leftfull curve in figure 6 is the prediction for the 1600–1630 data only using the data for 1900–1990and the number m = 384, which is the number of women getting married into the ten familiesduring the period 1600–1630. The actual name-frequency distribution for these women is givenby the crosses and the agreement is again quite good. In particular, note that the data are indeedconsistent with the slightly steeper slope for smaller k caused by a slightly larger � = 1.22(compare figure 7). The rightmost curve in figure 6, in the same way, gives the prediction basedon only using the data for the married women in 1900–1990 and the total population size in theyear 2000 given by m = 4.6 ⇥ 107. The census data from the year 2000 are also plotted and theagreement between the prediction and the data is again very good. This time the � = 1.07 is

New Journal of Physics 13 (2011) 073036 (http://www.njp.org/)

S. K. Baek (백승기), P. Minnhagen, and B. J. Kim (김범준), New J. Phys. 13, 073036 (2011) and references therein.

Page 13: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

족보 in Korean culturefamily tree of (mostly) male descendants

spouses’ 본관 and birth year: “by-product” data sets

10

10-9

10-6

10-3

100

100 102 104 106 108

P(k

)

k

1900-19901600-1630

Census

Figure 6. Comparison between the actual family books (points) and RGFpredictions (lines).

1

1.1

1.2

1.3

102 104 106 108

γ

M

Figure 7. Power-law exponent as a function of M . The solid line was obtainedby analyzing the family books from 1900–1990, and the dotted line is connectedwith the census data in 2000.

Mtot = 165 020 women each having one family name out of Ntot = 194 and where kmax = 32 316are named Kim. These three numbers Mtot, Ntot and kmax uniquely determine PM(k) within theRGF model, as explained in section 3 [6]. The middle full curve in figure 6 gives the predictedsize distribution and the pluses denote the actual (binned) data points. The agreement betweenthe RGF prediction and the data is very good, in particular in view of the fact that the predictionis based solely on the three numbers Mtot, Ntot and kmax. The prediction for the exponent � in(1) is � = 1.12. As explained in section 3, the RGF model allows you to predict how the PM(k)for either a smaller m < M or larger m > M . Figure 7 displays the predicted change of � whenstarting from the data given for the period 1900–1990. The solid curve gives the change whenm is decreasing and the dotted line when it is increasing. The fact that � changes with the sizeof the data set is a fundamental feature of the RGF model and distinguishes it from the usualgrowth models, which in general give scale-invariant and hence size-independent � [6]. The leftfull curve in figure 6 is the prediction for the 1600–1630 data only using the data for 1900–1990and the number m = 384, which is the number of women getting married into the ten familiesduring the period 1600–1630. The actual name-frequency distribution for these women is givenby the crosses and the agreement is again quite good. In particular, note that the data are indeedconsistent with the slightly steeper slope for smaller k caused by a slightly larger � = 1.22(compare figure 7). The rightmost curve in figure 6, in the same way, gives the prediction basedon only using the data for the married women in 1900–1990 and the total population size in theyear 2000 given by m = 4.6 ⇥ 107. The census data from the year 2000 are also plotted and theagreement between the prediction and the data is again very good. This time the � = 1.07 is

New Journal of Physics 13 (2011) 073036 (http://www.njp.org/)

S. K. Baek (백승기), P. Minnhagen, and B. J. Kim (김범준), New J. Phys. 13, 073036 (2011) and references therein.

geographical information from 본관: honestly, too valuable pieces of information to overlook (and just be satisfied with the distributions as if we don’t know anything about them), don’t you think? :)

Page 14: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Marriage record in 족보 as a proxy of human migration due to marriage

Page 15: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Marriage record in 족보 as a proxy of human migration due to marriage

Page 16: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Studies on human migration patterns: two types of generative models

where mi is the population at i and rij is the distance between i and j

J. Q. Stewart and W. Warntz, J. Regional Sci. 1, 99 (1958);W.-S. Jung (정우성), F. Wang, and H. E. Stanley, EPL 81, 48005 (2008).

(i ! j) where Ri is the total population outgoing from i,sij is the population in the circle of radius rij centered at i(excluding the source i and destination j’s population),and N is the total population

F. Simini, M. C. González, A. Maritan, and A.-L. Barabási, Nature 484, 96 (2012); A. P. Masucci, J. Serras, A. Johansson, and M. Batty, Phys. Rev. E 88, 022812 (2013).

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ijcontrolling the effect of geography (or lack thereof)gravity model flux

radiation model flux

Page 17: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

In practice . . .

이(李) 진주 12636이(李) 진해 490이(李) 차성 2173이(李) 창녕 4132이(李) 창원 719이(李) 천안 2297이(李) 철성 4871이(李) 청도 983이(李) 청송 814이(李) 청안 13549이(李) 청양 743이(李) 청주 34756이(李) 청평 656이(李) 청해 12002이(李) 춘천 372이(李) 충남 995이(李) 충주 4022이(李) 칠성 1209이(李) 태안 4084이(李) 태원 670이(李) 토산 491이(李) 통진 227이(李) 평산 3394이(李) 평양 579이(李) 평창 65945이(李) 평택 1099이(李) 평해 205이(李) 풍천 731이(李) 하동 1052이(李) 하빈 15058이(李) 하산 828이(李) 하음 431이(李) 학성 20964이(李) 한산 136615이(李) 한성 1342이(李) 한양 1096이(李) 함경 792이(李) 함안 37597이(李) 함양 1851이(李) 함평 125419이(李) 함풍 6413이(李) 함흥 786이(李) 합천 115462이(李) 해남 878이(李) 해령 650이(李) 해주 2064이(李) 행주 870이(李) 헌양 786이(李) 홍성 1163이(李) 홍주 14897이(李) 홍천 566이(李) 화산 1775이(李) 화평 1498이(李) 황주 291

the census data (in 1985 and 2000) showing the list of populations of 본관 for each administrative regions of South Korea, provided by Korean Statistical Information Service (KOSIS): http://kosis.kr/

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

Page 18: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

In practice . . .

이(李) 진주 12636이(李) 진해 490이(李) 차성 2173이(李) 창녕 4132이(李) 창원 719이(李) 천안 2297이(李) 철성 4871이(李) 청도 983이(李) 청송 814이(李) 청안 13549이(李) 청양 743이(李) 청주 34756이(李) 청평 656이(李) 청해 12002이(李) 춘천 372이(李) 충남 995이(李) 충주 4022이(李) 칠성 1209이(李) 태안 4084이(李) 태원 670이(李) 토산 491이(李) 통진 227이(李) 평산 3394이(李) 평양 579이(李) 평창 65945이(李) 평택 1099이(李) 평해 205이(李) 풍천 731이(李) 하동 1052이(李) 하빈 15058이(李) 하산 828이(李) 하음 431이(李) 학성 20964이(李) 한산 136615이(李) 한성 1342이(李) 한양 1096이(李) 함경 792이(李) 함안 37597이(李) 함양 1851이(李) 함평 125419이(李) 함풍 6413이(李) 함흥 786이(李) 합천 115462이(李) 해남 878이(李) 해령 650이(李) 해주 2064이(李) 행주 870이(李) 헌양 786이(李) 홍성 1163이(李) 홍주 14897이(李) 홍천 566이(李) 화산 1775이(李) 화평 1498이(李) 황주 291

the census data (in 1985 and 2000) showing the list of populations of 본관 for each administrative regions of South Korea, provided by Korean Statistical Information Service (KOSIS): http://kosis.kr/

population from 2000 census data

marriage “flux” in 족보

distance between two population centroids in 2000

brides’ 본관 index i

grooms’ 본관 index j: identical (constant) for a single 족보 by definition

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

Page 19: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

Gravity & radiation models

linear fit: marriage flux Ni / Gij

linear fit: marriage flux Ni / Rij

↵ ⇡ 1.0749 and � ⇡ �0.0349! ↵ = 1 and � = 0 (to include the case i = j)

grooms’ 본관 index j: identical (constant) for a single 족보 by definition

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

Page 20: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

No dependency on the distance and ? “population-product model” (the 본관 of the 족보 is ergodic, so the husbands could be almost everywhere in the country, which made the geographic factors irrelevant)

/ m↵i

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

Gravity & radiation models

linear fit: marriage flux Ni / Gij

linear fit: marriage flux Ni / Rij

↵ ⇡ 1.0749 and � ⇡ �0.0349! ↵ = 1 and � = 0 (to include the case i = j)

grooms’ 본관 index j: identical (constant) for a single 족보 by definition

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

Page 21: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

No dependency on the distance and ? “population-product model” (the 본관 of the 족보 is ergodic, so the husbands could be almost everywhere in the country, which made the geographic factors irrelevant)

/ m↵i

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

Gravity & radiation models

linear fit: marriage flux Ni / Gij

linear fit: marriage flux Ni / Rij

↵ ⇡ 1.0749 and � ⇡ �0.0349! ↵ = 1 and � = 0 (to include the case i = j)

grooms’ 본관 index j: identical (constant) for a single 족보 by definition

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

outlier: women from the 본관 of 족보 are severely underrepresented. In fact, marrying a spouse from the same 본관 had been considered as a taboo in Korean culture and even illegal in South Korea until 2005!

Page 22: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

No dependency on the distance and ? “population-product model” (the 본관 of the 족보 is ergodic, so the husbands could be almost everywhere in the country, which made the geographic factors irrelevant)

/ m↵i

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

Gravity & radiation models

linear fit: marriage flux Ni / Gij

linear fit: marriage flux Ni / Rij

↵ ⇡ 1.0749 and � ⇡ �0.0349! ↵ = 1 and � = 0 (to include the case i = j)

grooms’ 본관 index j: identical (constant) for a single 족보 by definition

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

outlier: women from the 본관 of 족보 are severely underrepresented. In fact, marrying a spouse from the same 본관 had been considered as a taboo in Korean culture and even illegal in South Korea until 2005!

a simple measure of ergodicity: number of different regions occupied by 본관

“ergodic” vs “non-ergodic” 본관

Page 23: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

“ergodic” vs “non-ergodic” 본관

6

0 2000

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(cl

an

s) (a)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(cl

an

s) (c)

0 2000

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(b)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(d)

FIG. 3. Distribution of the number of di↵erent administrative regionsoccupied by clans. (a) Probability distribution of the number of dif-ferent administrative regions occupied by a Korean clan in the year2000. (b) Probability distribution of the number of di↵erent admin-istrative regions occupied by the clan of a Korean individual selecteduniformly at random in the year 2000. The di↵erence between thispanel and the previous one arises from the fact that clans with largerpopulations tend to occupy more administrative regions. Note thatthe rightmost bar has a height of 0.17 but has been truncated for vi-sual presentation. (c) Probability distribution of radii of gyration (inkm) for clans in 2000. (d) Probability distribution of radii of gyra-tion (in km) for clans of a Korean individual selected uniformly atrandom in 2000. The di↵erence between this panel and the previ-ous one arises from the fact that clans with larger populations tendto occupy more administrative regions. Solid curves are kernel den-sity estimates (from Matlab R2011a’s ksdensity function with aGaussian smoothing kernel of width 5).

have jokbo are fairly ergodic, so the variables associated withthe j indices (i.e., the grooms) in Eqs. (1) and (2) have alreadylost much of their geographical precision, which is consistentboth with the values ↵ = 0 and � = 0 (the population prod-uct model). Again see the scatter plots in Fig. 2, in which wecolor each clan according to the number of di↵erent admin-istrative regions that it occupies. Note that the three di↵erentergodicity diagnostics are only weakly correlated (see Fig. I2).

Our observations of clan bimodality for Korea contrastsharply with our observations for family names in theCzech republic, where most family names appear to be non-ergodic [25] (see Fig. I3). One possible explanation of theubiquity of ergodic Korean names is the historical fact thatmany families from the lower social classes adopted (or evenpurchased) names of noble clans from the upper classes nearthe end of the Joseon dynasty (19th–20th centuries) [20, 52].At the time, Korean society was very unstable, and this pro-cess might have, in essence, introduced a preferential growthof ergodic names.

In Fig. 4, we show the distribution of the di↵usion constants

−5 200

0.7

diffusion constant (km2/year)

pro

ba

bili

ty d

istr

ibu

tion

FIG. 4. Distribution of estimated di↵usion constants (in km2/year)computed using 1985 and 2000 census data and Eq. (3). Thesolid curve is a kernel density estimate (from Matlab R2011a’sksdensity function with default smoothing). See the Appendixfor details of the calculation of di↵usion constants.

that we computed by fitting to Eq. (3). Some of the values arenegative, which presumably arises from finite-size e↵ects inergodic clans as well as basic limitations in estimating di↵u-sion constants using only a pair of nearby years. In Fig. I4,we show the correlations between the di↵usion constants andother measures.

C. Convection in Addition to Di↵usion as Another Mechanismfor Migration

The assumption that human populations simply di↵use is agross oversimplification of reality. We will thus consider theintriguing (but still grossly oversimplified) possibility of si-multaneous di↵usive and convective (bulk) transport. In thepast century, a dramatic movement from rural to urban areashas caused Seoul’s population to increase by a factor of morethan 50, tremendously outpacing Korea’s population growthas a whole [53]. This suggests the presence of a strong at-tractor or “sink” for the bulk flow of population into Seoul, ashas been discussed in rural-urban labor migration studies [54].The density-equalizing population cartogram [55] in Fig. I5clearly demonstrates the rapid growth of Seoul and its sur-roundings between 1970 and 2010.

If convection (i.e., bulk flow) directed towards Seoul hasindeed occurred throughout Korea while clans were simulta-neously di↵using from their points of origin, then one oughtto be able to detect a signature of such a flow. In Fig. 5(a),we show what we believe is such a signature: we observethat the fraction of ergodic clans increases with the distancebetween Seoul and a clan’s place of origin. This would be un-expected for a purely di↵usive system or, indeed, in any othersimple model that excludes convective transport. By allow-ing for bulk flow, we expect to observe that a clan’s mem-bers preferentially occupy territory in the flow path that islocated geographically between the clan’s starting point andSeoul. For clans that start closer to Seoul, this path is short;for those that start farther away, the longer flow path ought

Distribution of administrative regions occupied by 본관

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

ergodic

non-ergodic

Page 24: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

“ergodic” vs “non-ergodic” 본관

6

0 2000

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(cl

an

s) (a)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(cl

an

s) (c)

0 2000

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(b)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(d)

FIG. 3. Distribution of the number of di↵erent administrative regionsoccupied by clans. (a) Probability distribution of the number of dif-ferent administrative regions occupied by a Korean clan in the year2000. (b) Probability distribution of the number of di↵erent admin-istrative regions occupied by the clan of a Korean individual selecteduniformly at random in the year 2000. The di↵erence between thispanel and the previous one arises from the fact that clans with largerpopulations tend to occupy more administrative regions. Note thatthe rightmost bar has a height of 0.17 but has been truncated for vi-sual presentation. (c) Probability distribution of radii of gyration (inkm) for clans in 2000. (d) Probability distribution of radii of gyra-tion (in km) for clans of a Korean individual selected uniformly atrandom in 2000. The di↵erence between this panel and the previ-ous one arises from the fact that clans with larger populations tendto occupy more administrative regions. Solid curves are kernel den-sity estimates (from Matlab R2011a’s ksdensity function with aGaussian smoothing kernel of width 5).

have jokbo are fairly ergodic, so the variables associated withthe j indices (i.e., the grooms) in Eqs. (1) and (2) have alreadylost much of their geographical precision, which is consistentboth with the values ↵ = 0 and � = 0 (the population prod-uct model). Again see the scatter plots in Fig. 2, in which wecolor each clan according to the number of di↵erent admin-istrative regions that it occupies. Note that the three di↵erentergodicity diagnostics are only weakly correlated (see Fig. I2).

Our observations of clan bimodality for Korea contrastsharply with our observations for family names in theCzech republic, where most family names appear to be non-ergodic [25] (see Fig. I3). One possible explanation of theubiquity of ergodic Korean names is the historical fact thatmany families from the lower social classes adopted (or evenpurchased) names of noble clans from the upper classes nearthe end of the Joseon dynasty (19th–20th centuries) [20, 52].At the time, Korean society was very unstable, and this pro-cess might have, in essence, introduced a preferential growthof ergodic names.

In Fig. 4, we show the distribution of the di↵usion constants

−5 200

0.7

diffusion constant (km2/year)

pro

ba

bili

ty d

istr

ibu

tion

FIG. 4. Distribution of estimated di↵usion constants (in km2/year)computed using 1985 and 2000 census data and Eq. (3). Thesolid curve is a kernel density estimate (from Matlab R2011a’sksdensity function with default smoothing). See the Appendixfor details of the calculation of di↵usion constants.

that we computed by fitting to Eq. (3). Some of the values arenegative, which presumably arises from finite-size e↵ects inergodic clans as well as basic limitations in estimating di↵u-sion constants using only a pair of nearby years. In Fig. I4,we show the correlations between the di↵usion constants andother measures.

C. Convection in Addition to Di↵usion as Another Mechanismfor Migration

The assumption that human populations simply di↵use is agross oversimplification of reality. We will thus consider theintriguing (but still grossly oversimplified) possibility of si-multaneous di↵usive and convective (bulk) transport. In thepast century, a dramatic movement from rural to urban areashas caused Seoul’s population to increase by a factor of morethan 50, tremendously outpacing Korea’s population growthas a whole [53]. This suggests the presence of a strong at-tractor or “sink” for the bulk flow of population into Seoul, ashas been discussed in rural-urban labor migration studies [54].The density-equalizing population cartogram [55] in Fig. I5clearly demonstrates the rapid growth of Seoul and its sur-roundings between 1970 and 2010.

If convection (i.e., bulk flow) directed towards Seoul hasindeed occurred throughout Korea while clans were simulta-neously di↵using from their points of origin, then one oughtto be able to detect a signature of such a flow. In Fig. 5(a),we show what we believe is such a signature: we observethat the fraction of ergodic clans increases with the distancebetween Seoul and a clan’s place of origin. This would be un-expected for a purely di↵usive system or, indeed, in any othersimple model that excludes convective transport. By allow-ing for bulk flow, we expect to observe that a clan’s mem-bers preferentially occupy territory in the flow path that islocated geographically between the clan’s starting point andSeoul. For clans that start closer to Seoul, this path is short;for those that start farther away, the longer flow path ought

Distribution of administrative regions occupied by 본관

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

ergodic

non-ergodic

19

Surnames, Geoforum 42, 506 (2011).[25] J. Novotný and J. A. Cheshire, The Surname Space of the Czech

Republic: Examining Population Structure by Network Anal-ysis of Spatial Co-Occurrence of Surnames, PLOS ONE 7,e48568 (2012).

[26] An original jokbo usually includes more details about clans, butwe only use information that we mention explicitly.

[27] Google Maps, https://developers.google.com/maps/.

[28] In fact, it is known that clans that originated in the southernpart of Korea have been more abundant in recent Korean history(including the period that is spanned by our data set) [20].

[29] Korean Statistical Information Service, http://kosis.kr/ (Korean version) and http://kosis.kr/eng/ (En-glish version).

[30] Statistical Geographic Information Service (:üx>⇢t�o�&Ò⌦ò–"fq�€º in Korean), http://sgis.kostat.go.kr/ (Ko-rean version; no English version available).

[31] J. P. Sethna, Statistical Mechanics: Entropy, Order Parame-

0 2000

500

number of regions occupied

dis

tance

move

d (

km) (a)

0 2500

500

radius of gyration (km)

dis

tance

move

d (

km) (b)

0 2000

250

number of regions occupied

rad. of gyr

atio

n (

km) (c)

FIG. I2. Scatter plots of (a) the distance between the clan-originlocation and the population centroid versus the number of admin-istrative regions, (b) the distance between the clan-origin locationand the population centroid versus the radius of gyration, and (c)the radius of gyration versus the number of administrative regions.The corresponding Pearson correlation values are (a) r ⇡ 0.20 (from3 481 clans that include all of the required information; the p-valueis p ⇡ 5.7 ⇥ 10�34), (b) r ⇡ 0.066 (from 3 481 clans that includeall of the required information; the p-value p ⇡ 9.4 ⇥ 10�5), and (c)r ⇡ �0.26 (from 3 481 clans; the p-value is p ⇡ 1.8⇥10�53). Note thatcorrelations over limited ranges may be di↵erent and significant: forexample, in panel (c), the two metrics for ergodicity are significantlypositively correlated when rg < 50 km (r ⇡ 0.39, p ⇡ 2.3 ⇥ 10�4).

ters and Complexity (Oxford University Press, Oxford, UnitedKingdom, 2006)

[32] J. van Lith, Ergodic Theory, Interpretations of Probability andthe Foundations of Statistical Mechanics, Stud. Hist. Philos.Sci. 32, 581 (2001).

[33] Because we only have ten family books, our data allows onlyten distinct possibilities for “clan j” (the family into which awoman marries).

[34] S. Cho, Chapter 8. Traditional Korean Culture in Cultural andEthnic Diversity: A Guide for Genetics Professionals eds FisherNL (The Johns Hopkins University Press, Maryland, UnitedStates, 1996).

[35] F. Simini, M. C. González, A. Maritan, and A.-L. Barabási, AUniversal Model for Mobility and Migration Patterns, Nature(London) 484, 96 (2012).

[36] A. P. Masucci, J. Serras, A. Johansson, and M. Batty, Grav-ity versus Radiation Models: On the Importance of Scale andHeterogeneity in Commuting Flows, Phys. Rev. E 88, 022812

0 2060

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(cl

an

s) (a)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(cl

an

s) (c)

0 2060

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(b)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(d)

FIG. I3. (a) Probability distribution of the number of di↵erent admin-istrative regions occupied a Czech family name in 2009. Note thatthe leftmost two bars have heights of 0.17 and 0.03 but have beentruncated for visual presentation. (This data was initially analysedin Ref. [25].) (b) Probability distribution of the number of di↵erentadministrative regions occupied by the clan of a Czech individualselected uniformly at random in 2009. The di↵erence between thispanel and the previous one arises from the fact that clans with largerpopulations tend to occupy more administrative regions (c) Probabil-ity distribution of radii of gyration (in km) of Czech family namesin 2009. Note that the leftmost bar has a height of 0.11 but has beentruncated for visual presentation. (d) Probability distribution of radiiof gyration (in km) of Czech family names of a Czech individualselected uniformly at random in 2009. The di↵erence between thispanel and the previous one arises from the fact that clans with largerpopulations tend to occupy more administrative regions. Observethat the distributions in panels (a) and (b) are starkly di↵erent fromthe distributions in panels (a) and (b) from Fig. 3. Solid curves arekernel density estimates (from Matlab R2011a’s ksdensity func-tion with a Gaussian smoothing kernel of width 5).

surnames in Czech Republic

non-ergodic

data from J. Novotný and J. A. Cheshire, PLOS ONE 7, e48568 (2012).

then interpreted as a result of the subsequent resettlement andindustrialization led immigration into these areas, but also of somestate policies that have contributed to the spatial concentrations(and often also segregations) of Roma minority groups [23].

An intriguing exception to these explanations is a typical Czechsurname Vlcek that can also be found in Figure 6 because of itssignificant revealed relatedness with Wolf. From all of thesurnames considered, the name Wolf has been found as thenearest neighbour of Vlcek, with the 56.7% probability that one ofthese surnames concentrates in the region where another one isconcentrated. The high co-occurrence of these two surnames inthe identical regions seems to be attributable to their commonmeaning – Vlcek literally means ‘‘small Wolf’’ in the Czechlanguage. The bi-lingual naming practices or secular nametransformations taking place in these historically multi-ethnicregions (German and Czech) offers the most likely explanation forsuch commonalities.

Analysis of co-occurrence in municipalitiesIn the second stage of our analysis we examined the co-

occurrence of Czech surnames at the finest spatial level of 6,244municipalities. We began with the calculation of the pairwiseindices of revealed relatedness (Di,j,mun) among 5,660 surnamesselected on the basis of the highest revealed relatedness at moreaggregate spatial level. This sample of surnames covers almost ahalf of the Czech male population. Given the significantly highernumber of spatial units considered for this second stage of ouranalysis, the values of Di,j,mun are generally lower than Di,j,reg in thefirst stage which focused on co-occurrence in 206 micro-regionsonly. At the same time, the size distribution of these second stageresults is even more skewed to the right; the maximum Di,j,mun

(from the total of more than 32 million of observations)corresponds to 0.687, while only 0.011% of all observationsexceed 50% of the maximum value. These differences between thefirst and second stage results are understandable and go hand inhand with the expectation that the surname network based on themunicipality level calculations will be more fragmented.

This has been confirmed by the fact that a majority of the mostsignificant Di,j,mun proximity observations occur among relatively

rare surnames that are typically concentrated in a few nearbymunicipalities. This is especially the case of Silesian surnames thataccount for almost all Di,j,mun observations at the very top of thedistribution of results. As such, in order to get a reasonablenetwork representation, we again had to impose some restrictionsin relation to the minimal size of surnames shown as nodes and thestrength of links between them. After applying the criteria from theprevious section, we found the frequency of at least 150 bearersand the links determined by Di,j,mun .0.23 to be optimal. Thesurname network based on these parameters and generated by aweighted force-directed algorithm is depicted in Figure 7 (Fig-ure S5 depicts a high resolution version with labels of individualsurnames and the size of nodes scaled by their population size).

In general, the second stage or municipality level surnamenetwork has reproduced the macro-division of the Czech surnamespace identified in the first stage and described above. Theproportions between the sizes of the main clusters are howeverdifferent with the previously mentioned dominance of the densegroup of Silesian surnames (B2). Regarding Moravian surnames,again the commonality of verb-derived surnames emerges, as theyform the majority of names in the B1 area of the network. TheBohemian part of the surname space (A) is structured into threemain groups of surnames. The A1 cluster comprises some of themost frequent surnames and those prevalent across most ofBohemian regions, whilst the separation from the secondarycluster (A2) is hardly discernible. By contrast, two other core areasare well recognizable and represent northern and easternBohemian names (A3) more specifically and surnames concen-trated mainly in municipalities in the north-west and west ofBohemia (A4).

The general congruence in macro-structure of the surnamenetworks constructed here and in the first stage of our analysis isan important finding (generally similar macro-structure was alsofound when the Ji,j,reg and Ji,j,mun were considered instead of theDi,j,reg and Di,j,mun, respectively). However, the main value of thissecond stage municipality level exercise should be seen inindividual details uncovered with respect to local parts of thesurname network. A number of interesting examples of pairs ofsurnames that have been found as potentially closely related,

Figure 4. Spatial concentration of individual communities of high degree surnames. Individual maps show regional variation in thepercentage of high degree surnames from particular core communities (A1, A2, A3, A4, B1, B2) as listed in Table 5 concentrated in a given region. Forexample, if the percentage of high degree surnames for A1 (the upper left map) corresponds to 100, then all the surnames listed in the A1 group inTable 5 are concentrated in a given region (that is, all of them satisfy LQi,r .1 for the region in question).doi:10.1371/journal.pone.0048568.g004

The Surname Space of the Czech Republic

PLOS ONE | www.plosone.org 8 October 2012 | Volume 7 | Issue 10 | e48568

Page 25: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

More sophisticated measure of ergodicity & simple diffusion dynamics

“clan density anomaly”:

�i(rk, t) = ci(rk, t)�mi(t)

N(t)⇢(rk, t) ,

where rk is the two-dimensional coordinate vector of region k,ci(rk, t) is clan i’s population density in region k at time t,mi(t) is the total population of clan i in the country at time t,N(t) is the total population (all clans) in the country at time t,and ⇢(rk, t) is the total population density (all clans) in region k at time t.

본관

본관

본관

본관

본관

Page 26: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

More sophisticated measure of ergodicity & simple diffusion dynamics

“clan density anomaly”:

�i(rk, t) = ci(rk, t)�mi(t)

N(t)⇢(rk, t) ,

where rk is the two-dimensional coordinate vector of region k,ci(rk, t) is clan i’s population density in region k at time t,mi(t) is the total population of clan i in the country at time t,N(t) is the total population (all clans) in the country at time t,and ⇢(rk, t) is the total population density (all clans) in region k at time t.

본관

본관

clan i’s radius of gyration of anomaly:

rgi =

s1

�i,tot(t)

X

k

[krk � ri,C(t)k2�i(rk, t)Ak] ,

where �i,tot(t) =P

k �i(rk, t)Ak,ri,C(t) = [1/�i,tot(t)]

Pk rk�i(rk, t)Ak,

and Ak is the area of region k.

본관

본관

본관

본관

Page 27: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

More sophisticated measure of ergodicity & simple diffusion dynamics

assuming a simple diffusive process over time: measuring the diffusion constants by numerically applying the partial differential equation of ϕi above at the two endpoints (“boundary conditions”) in time

assuming that the spatial variation of (mi/N)⇢ is much smaller than that of ci

a very simple di↵usion model described by Fick’s second law:

the flux of clan members

~Ji / ~r�i

(individuals move preferentially away from high concentrations of their family),

then

@ci@t

=

~r · ~Ji / r2�i

(assuming nor spatial variation in the constant of proportionality), or

@�i

@t= Dir2�i,

with the di↵usion constant Di with dimensions [length

2/time].

본관

Page 28: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

More sophisticated measure of ergodicity & simple diffusion dynamics

assuming a simple diffusive process over time: measuring the diffusion constants by numerically applying the partial differential equation of ϕi above at the two endpoints (“boundary conditions”) in time

assuming that the spatial variation of (mi/N)⇢ is much smaller than that of ci

a very simple di↵usion model described by Fick’s second law:

the flux of clan members

~Ji / ~r�i

(individuals move preferentially away from high concentrations of their family),

then

@ci@t

=

~r · ~Ji / r2�i

(assuming nor spatial variation in the constant of proportionality), or

@�i

@t= Dir2�i,

with the di↵usion constant Di with dimensions [length

2/time].

본관 density anomaly of Cho from Haman (함안 조)

3

FIG

.3.Example

ofclananomaly

picture

forclan1,1985vs.

2000,usingDanny’s

anomaly

defi

nition.Circleis

centeredat

CM

positionwithradiussetto

radiusofgyration.Heredi↵usivee↵

ects

are

visible.

FIG

.4.Example

ofclananomaly

picture

forKim

from

Gim

hae,

1985vs.

2000,usingDanny’s

anomaly

defi

nition.Circleis

centeredatCM

positionwithradiussetto

radiusofgyration.(N

ote

positionandsize

ofcircle

are

reasonable.)

large �

small �6

0 2000

0.02

number of regions occupied

pro

babili

ty d

ist. (

clans) (a)

0 2500

0.02

radius of gyration (km)

pro

babili

ty d

ist. (

clans) (c)

0 2000

0.02

number of regions occupied

pro

babili

ty d

ist. (

indiv

iduals

)

(b)

0 2500

0.02

radius of gyration (km)

pro

babili

ty d

ist. (

indiv

iduals

)

(d)

FIG. 3. Distribution of the number of di↵erent administrative regionsoccupied by clans. (a) Probability distribution of the number of dif-ferent administrative regions occupied by a Korean clan in the year2000. (b) Probability distribution of the number of di↵erent admin-istrative regions occupied by the clan of a Korean individual selecteduniformly at random in the year 2000. The di↵erence between thispanel and the previous one arises from the fact that clans with largerpopulations tend to occupy more administrative regions. Note thatthe rightmost bar has a height of 0.17 but has been truncated for vi-sual presentation. (c) Probability distribution of radii of gyration (inkm) for clans in 2000. (d) Probability distribution of radii of gyra-tion (in km) for clans of a Korean individual selected uniformly atrandom in 2000. The di↵erence between this panel and the previ-ous one arises from the fact that clans with larger populations tendto occupy more administrative regions. Solid curves are kernel den-sity estimates (from Matlab R2011a’s ksdensity function with aGaussian smoothing kernel of width 5).

have jokbo are fairly ergodic, so the variables associated withthe j indices (i.e., the grooms) in Eqs. (1) and (2) have alreadylost much of their geographical precision, which is consistentboth with the values ↵ = 0 and � = 0 (the population prod-uct model). Again see the scatter plots in Fig. 2, in which wecolor each clan according to the number of di↵erent admin-istrative regions that it occupies. Note that the three di↵erentergodicity diagnostics are only weakly correlated (see Fig. I2).

Our observations of clan bimodality for Korea contrastsharply with our observations for family names in theCzech republic, where most family names appear to be non-ergodic [25] (see Fig. I3). One possible explanation of theubiquity of ergodic Korean names is the historical fact thatmany families from the lower social classes adopted (or evenpurchased) names of noble clans from the upper classes nearthe end of the Joseon dynasty (19th–20th centuries) [20, 52].At the time, Korean society was very unstable, and this pro-cess might have, in essence, introduced a preferential growthof ergodic names.

In Fig. 4, we show the distribution of the di↵usion constants

−5 200

0.7

diffusion constant (km2/year)

pro

babili

ty d

istr

ibutio

n

FIG. 4. Distribution of estimated di↵usion constants (in km2/year)computed using 1985 and 2000 census data and Eq. (3). Thesolid curve is a kernel density estimate (from Matlab R2011a’sksdensity function with default smoothing). See the Appendixfor details of the calculation of di↵usion constants.

that we computed by fitting to Eq. (3). Some of the values arenegative, which presumably arises from finite-size e↵ects inergodic clans as well as basic limitations in estimating di↵u-sion constants using only a pair of nearby years. In Fig. I4,we show the correlations between the di↵usion constants andother measures.

C. Convection in Addition to Di↵usion as Another Mechanismfor Migration

The assumption that human populations simply di↵use is agross oversimplification of reality. We will thus consider theintriguing (but still grossly oversimplified) possibility of si-multaneous di↵usive and convective (bulk) transport. In thepast century, a dramatic movement from rural to urban areashas caused Seoul’s population to increase by a factor of morethan 50, tremendously outpacing Korea’s population growthas a whole [53]. This suggests the presence of a strong at-tractor or “sink” for the bulk flow of population into Seoul, ashas been discussed in rural-urban labor migration studies [54].The density-equalizing population cartogram [55] in Fig. I5clearly demonstrates the rapid growth of Seoul and its sur-roundings between 1970 and 2010.

If convection (i.e., bulk flow) directed towards Seoul hasindeed occurred throughout Korea while clans were simulta-neously di↵using from their points of origin, then one oughtto be able to detect a signature of such a flow. In Fig. 5(a),we show what we believe is such a signature: we observethat the fraction of ergodic clans increases with the distancebetween Seoul and a clan’s place of origin. This would be un-expected for a purely di↵usive system or, indeed, in any othersimple model that excludes convective transport. By allow-ing for bulk flow, we expect to observe that a clan’s mem-bers preferentially occupy territory in the flow path that islocated geographically between the clan’s starting point andSeoul. For clans that start closer to Seoul, this path is short;for those that start farther away, the longer flow path ought

distribution of estimated diffusion constants

본관

Page 29: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distancehave also existed for a longer time and hence have undergonedi↵usion longer; we thus also expect such clans to be more er-godic. This is consistent with our observations in Fig. 5(b) fordistances less than about 150 km, but it is di�cult to use thesame logic to explain our observations for distances greaterthan 150 km. However, if one assumes that long-distancemoves are more likely to arise from convective e↵ects thanfrom di↵usive ones, then our observations for both short andlong distances become understandable: the fraction of movesfrom bulk-flow e↵ects like resettlement or transplantation islarger for long-distance moves, and they become increasingly

dominant as the distance approaches 325 km (roughly the sizeof the Korean peninsula). We speculate that the clans thatmoved farther than 150 km are likely to be ones that origi-nated in the most remote areas of Korea, or even outside ofKorea, and that they have only relatively recently been trans-planted to major Korean population centers, from which theyhave had little time to spread. This observation is necessarilyspeculative because the age of a clan is not easy to determine:the first entry in a jokbo (see Table A-I for our ten jokbo) couldhave resulted from the invention of characters or printing de-vices rather than from the true birth of a clan [20].

Ultimately, our data are insu�cient to definitively ac-cept or reject the hypothesis of human di↵usion. However,as our analysis demonstrates, our data are consistent withthe theory of simultaneous human “di↵usion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure di↵usion is correct, then our estimated di↵u-sion constants indicate a possible time scale for relaxation toa dynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximately(100000 km2)/(1.5 km2/year) ⇡ 67000 years for purely dif-fusive mixing to produce a well-mixed society. A convectiveprocess thus appears to be playing the important role of pro-moting human interaction by accelerating mixing in the pop-ulation. In spite of such limitations, we try to estimate andquantify the centrality of Seoul in the context of a popula-tion flow network model and find other distinctive characteris-tics of the population flow patterns of ergodic and non-ergodicclans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Korean cul-ture provides an unusual opportunity for quantitative researchon historical human mobility and migration, and our inves-tigation strongly suggests that both “di↵usive” and “convec-tive” patterns have played important roles in establishing thecurrent distribution of clans in Korea.

By studying the geographical locations of clan origins injokbo (Korean family books), we have quantified the extent of“ergodicity” of Korean clans as reflected in time series of mar-riage snapshots. This illustrates the need to investigate the dis-tribution of individual clans in more details. Additionally, bycomparing our results from Korean clans to those from Czechfamilies, we have also demonstrated that these ideas can giveinsightful indications of di↵erent mobility and migration pat-terns in di↵erent cultures. Our ergodicity analysis using mod-ern census data clearly illustrates that there are both ergodicand non-ergodic clans, and we have used these results to sug-gest two di↵erent mechanisms for human migration on longtime scales. Many processes involve the attractiveness of thecenter versus di↵usion away from it, so this type of modelingframework can be applied to a lot of data sets, we believe.

A noteworthy feature of our analysis is that we used bothdata with high temporal resolution but low spatial resolution(jokbo data) and data with high spatial resolution but low tem-poral resolution (census data). This allowed us to consider

“diffusion” vs “convection” of human migration?

본관본관

Page 30: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distancehave also existed for a longer time and hence have undergonedi↵usion longer; we thus also expect such clans to be more er-godic. This is consistent with our observations in Fig. 5(b) fordistances less than about 150 km, but it is di�cult to use thesame logic to explain our observations for distances greaterthan 150 km. However, if one assumes that long-distancemoves are more likely to arise from convective e↵ects thanfrom di↵usive ones, then our observations for both short andlong distances become understandable: the fraction of movesfrom bulk-flow e↵ects like resettlement or transplantation islarger for long-distance moves, and they become increasingly

dominant as the distance approaches 325 km (roughly the sizeof the Korean peninsula). We speculate that the clans thatmoved farther than 150 km are likely to be ones that origi-nated in the most remote areas of Korea, or even outside ofKorea, and that they have only relatively recently been trans-planted to major Korean population centers, from which theyhave had little time to spread. This observation is necessarilyspeculative because the age of a clan is not easy to determine:the first entry in a jokbo (see Table A-I for our ten jokbo) couldhave resulted from the invention of characters or printing de-vices rather than from the true birth of a clan [20].

Ultimately, our data are insu�cient to definitively ac-cept or reject the hypothesis of human di↵usion. However,as our analysis demonstrates, our data are consistent withthe theory of simultaneous human “di↵usion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure di↵usion is correct, then our estimated di↵u-sion constants indicate a possible time scale for relaxation toa dynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximately(100000 km2)/(1.5 km2/year) ⇡ 67000 years for purely dif-fusive mixing to produce a well-mixed society. A convectiveprocess thus appears to be playing the important role of pro-moting human interaction by accelerating mixing in the pop-ulation. In spite of such limitations, we try to estimate andquantify the centrality of Seoul in the context of a popula-tion flow network model and find other distinctive characteris-tics of the population flow patterns of ergodic and non-ergodicclans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Korean cul-ture provides an unusual opportunity for quantitative researchon historical human mobility and migration, and our inves-tigation strongly suggests that both “di↵usive” and “convec-tive” patterns have played important roles in establishing thecurrent distribution of clans in Korea.

By studying the geographical locations of clan origins injokbo (Korean family books), we have quantified the extent of“ergodicity” of Korean clans as reflected in time series of mar-riage snapshots. This illustrates the need to investigate the dis-tribution of individual clans in more details. Additionally, bycomparing our results from Korean clans to those from Czechfamilies, we have also demonstrated that these ideas can giveinsightful indications of di↵erent mobility and migration pat-terns in di↵erent cultures. Our ergodicity analysis using mod-ern census data clearly illustrates that there are both ergodicand non-ergodic clans, and we have used these results to sug-gest two di↵erent mechanisms for human migration on longtime scales. Many processes involve the attractiveness of thecenter versus di↵usion away from it, so this type of modelingframework can be applied to a lot of data sets, we believe.

A noteworthy feature of our analysis is that we used bothdata with high temporal resolution but low spatial resolution(jokbo data) and data with high spatial resolution but low tem-poral resolution (census data). This allowed us to consider

“diffusion” vs “convection” of human migration?

diffusion part?

본관본관

Page 31: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distancehave also existed for a longer time and hence have undergonedi↵usion longer; we thus also expect such clans to be more er-godic. This is consistent with our observations in Fig. 5(b) fordistances less than about 150 km, but it is di�cult to use thesame logic to explain our observations for distances greaterthan 150 km. However, if one assumes that long-distancemoves are more likely to arise from convective e↵ects thanfrom di↵usive ones, then our observations for both short andlong distances become understandable: the fraction of movesfrom bulk-flow e↵ects like resettlement or transplantation islarger for long-distance moves, and they become increasingly

dominant as the distance approaches 325 km (roughly the sizeof the Korean peninsula). We speculate that the clans thatmoved farther than 150 km are likely to be ones that origi-nated in the most remote areas of Korea, or even outside ofKorea, and that they have only relatively recently been trans-planted to major Korean population centers, from which theyhave had little time to spread. This observation is necessarilyspeculative because the age of a clan is not easy to determine:the first entry in a jokbo (see Table A-I for our ten jokbo) couldhave resulted from the invention of characters or printing de-vices rather than from the true birth of a clan [20].

Ultimately, our data are insu�cient to definitively ac-cept or reject the hypothesis of human di↵usion. However,as our analysis demonstrates, our data are consistent withthe theory of simultaneous human “di↵usion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure di↵usion is correct, then our estimated di↵u-sion constants indicate a possible time scale for relaxation toa dynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximately(100000 km2)/(1.5 km2/year) ⇡ 67000 years for purely dif-fusive mixing to produce a well-mixed society. A convectiveprocess thus appears to be playing the important role of pro-moting human interaction by accelerating mixing in the pop-ulation. In spite of such limitations, we try to estimate andquantify the centrality of Seoul in the context of a popula-tion flow network model and find other distinctive characteris-tics of the population flow patterns of ergodic and non-ergodicclans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Korean cul-ture provides an unusual opportunity for quantitative researchon historical human mobility and migration, and our inves-tigation strongly suggests that both “di↵usive” and “convec-tive” patterns have played important roles in establishing thecurrent distribution of clans in Korea.

By studying the geographical locations of clan origins injokbo (Korean family books), we have quantified the extent of“ergodicity” of Korean clans as reflected in time series of mar-riage snapshots. This illustrates the need to investigate the dis-tribution of individual clans in more details. Additionally, bycomparing our results from Korean clans to those from Czechfamilies, we have also demonstrated that these ideas can giveinsightful indications of di↵erent mobility and migration pat-terns in di↵erent cultures. Our ergodicity analysis using mod-ern census data clearly illustrates that there are both ergodicand non-ergodic clans, and we have used these results to sug-gest two di↵erent mechanisms for human migration on longtime scales. Many processes involve the attractiveness of thecenter versus di↵usion away from it, so this type of modelingframework can be applied to a lot of data sets, we believe.

A noteworthy feature of our analysis is that we used bothdata with high temporal resolution but low spatial resolution(jokbo data) and data with high spatial resolution but low tem-poral resolution (census data). This allowed us to consider

“diffusion” vs “convection” of human migration?

diffusion part?

본관본관

Page 32: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distancehave also existed for a longer time and hence have undergonedi↵usion longer; we thus also expect such clans to be more er-godic. This is consistent with our observations in Fig. 5(b) fordistances less than about 150 km, but it is di�cult to use thesame logic to explain our observations for distances greaterthan 150 km. However, if one assumes that long-distancemoves are more likely to arise from convective e↵ects thanfrom di↵usive ones, then our observations for both short andlong distances become understandable: the fraction of movesfrom bulk-flow e↵ects like resettlement or transplantation islarger for long-distance moves, and they become increasingly

dominant as the distance approaches 325 km (roughly the sizeof the Korean peninsula). We speculate that the clans thatmoved farther than 150 km are likely to be ones that origi-nated in the most remote areas of Korea, or even outside ofKorea, and that they have only relatively recently been trans-planted to major Korean population centers, from which theyhave had little time to spread. This observation is necessarilyspeculative because the age of a clan is not easy to determine:the first entry in a jokbo (see Table A-I for our ten jokbo) couldhave resulted from the invention of characters or printing de-vices rather than from the true birth of a clan [20].

Ultimately, our data are insu�cient to definitively ac-cept or reject the hypothesis of human di↵usion. However,as our analysis demonstrates, our data are consistent withthe theory of simultaneous human “di↵usion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure di↵usion is correct, then our estimated di↵u-sion constants indicate a possible time scale for relaxation toa dynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximately(100000 km2)/(1.5 km2/year) ⇡ 67000 years for purely dif-fusive mixing to produce a well-mixed society. A convectiveprocess thus appears to be playing the important role of pro-moting human interaction by accelerating mixing in the pop-ulation. In spite of such limitations, we try to estimate andquantify the centrality of Seoul in the context of a popula-tion flow network model and find other distinctive characteris-tics of the population flow patterns of ergodic and non-ergodicclans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Korean cul-ture provides an unusual opportunity for quantitative researchon historical human mobility and migration, and our inves-tigation strongly suggests that both “di↵usive” and “convec-tive” patterns have played important roles in establishing thecurrent distribution of clans in Korea.

By studying the geographical locations of clan origins injokbo (Korean family books), we have quantified the extent of“ergodicity” of Korean clans as reflected in time series of mar-riage snapshots. This illustrates the need to investigate the dis-tribution of individual clans in more details. Additionally, bycomparing our results from Korean clans to those from Czechfamilies, we have also demonstrated that these ideas can giveinsightful indications of di↵erent mobility and migration pat-terns in di↵erent cultures. Our ergodicity analysis using mod-ern census data clearly illustrates that there are both ergodicand non-ergodic clans, and we have used these results to sug-gest two di↵erent mechanisms for human migration on longtime scales. Many processes involve the attractiveness of thecenter versus di↵usion away from it, so this type of modelingframework can be applied to a lot of data sets, we believe.

A noteworthy feature of our analysis is that we used bothdata with high temporal resolution but low spatial resolution(jokbo data) and data with high spatial resolution but low tem-poral resolution (census data). This allowed us to consider

“diffusion” vs “convection” of human migration?

convection part?diffusion part?

본관본관

Page 33: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distancehave also existed for a longer time and hence have undergonedi↵usion longer; we thus also expect such clans to be more er-godic. This is consistent with our observations in Fig. 5(b) fordistances less than about 150 km, but it is di�cult to use thesame logic to explain our observations for distances greaterthan 150 km. However, if one assumes that long-distancemoves are more likely to arise from convective e↵ects thanfrom di↵usive ones, then our observations for both short andlong distances become understandable: the fraction of movesfrom bulk-flow e↵ects like resettlement or transplantation islarger for long-distance moves, and they become increasingly

dominant as the distance approaches 325 km (roughly the sizeof the Korean peninsula). We speculate that the clans thatmoved farther than 150 km are likely to be ones that origi-nated in the most remote areas of Korea, or even outside ofKorea, and that they have only relatively recently been trans-planted to major Korean population centers, from which theyhave had little time to spread. This observation is necessarilyspeculative because the age of a clan is not easy to determine:the first entry in a jokbo (see Table A-I for our ten jokbo) couldhave resulted from the invention of characters or printing de-vices rather than from the true birth of a clan [20].

Ultimately, our data are insu�cient to definitively ac-cept or reject the hypothesis of human di↵usion. However,as our analysis demonstrates, our data are consistent withthe theory of simultaneous human “di↵usion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure di↵usion is correct, then our estimated di↵u-sion constants indicate a possible time scale for relaxation toa dynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximately(100000 km2)/(1.5 km2/year) ⇡ 67000 years for purely dif-fusive mixing to produce a well-mixed society. A convectiveprocess thus appears to be playing the important role of pro-moting human interaction by accelerating mixing in the pop-ulation. In spite of such limitations, we try to estimate andquantify the centrality of Seoul in the context of a popula-tion flow network model and find other distinctive characteris-tics of the population flow patterns of ergodic and non-ergodicclans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Korean cul-ture provides an unusual opportunity for quantitative researchon historical human mobility and migration, and our inves-tigation strongly suggests that both “di↵usive” and “convec-tive” patterns have played important roles in establishing thecurrent distribution of clans in Korea.

By studying the geographical locations of clan origins injokbo (Korean family books), we have quantified the extent of“ergodicity” of Korean clans as reflected in time series of mar-riage snapshots. This illustrates the need to investigate the dis-tribution of individual clans in more details. Additionally, bycomparing our results from Korean clans to those from Czechfamilies, we have also demonstrated that these ideas can giveinsightful indications of di↵erent mobility and migration pat-terns in di↵erent cultures. Our ergodicity analysis using mod-ern census data clearly illustrates that there are both ergodicand non-ergodic clans, and we have used these results to sug-gest two di↵erent mechanisms for human migration on longtime scales. Many processes involve the attractiveness of thecenter versus di↵usion away from it, so this type of modelingframework can be applied to a lot of data sets, we believe.

A noteworthy feature of our analysis is that we used bothdata with high temporal resolution but low spatial resolution(jokbo data) and data with high spatial resolution but low tem-poral resolution (census data). This allowed us to consider

“diffusion” vs “convection” of human migration?

convection part?diffusion part?

cannot be expected in a purely diffusive system!

본관본관

Page 34: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Summary and Outlook• Human migration (long time scale) throughout history by

combining two complementary data sets:Korean 족보 (good temporal resolution) +modern census data (good spatial resolution)

• marriage pattern (short time scale: “snapshots”) based on 족보: describable by simple human migration models?

• the “ergodicity” (a result of migration) of 본관

• diffusive vs convective migration

• More data (more 족보, census, other countries’ data) → better model/prediction

Thank you for your attention!SHL, R. Ffrancon, D. M. Abrams, B. J. Kim (김범준), and M. A. Porter, Phys. Rev. X 4, 041009 (2014).

Page 35: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Hawoong [ Jeong (from Dongrae)]: [(동래)정]하웅 (족보 data), Josef Novotný (Czech family name data)

Acknowledgments

Mason Porter(University of Oxford)

Beom Jun [Kim (from Gimhae)]: [(김해)김]범준 (Sungkyunkwan University)

Danny Abrams(Northwestern University)

Robyn Ffrancon(University of Gothenburg)

SHL, R. Ffrancon, D. M. Abrams, B. J. Kim (김범준), and M. A. Porter, Phys. Rev. X 4, 041009 (2014).

Page 36: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Hawoong [ Jeong (from Dongrae)]: [(동래)정]하웅 (족보 data), Josef Novotný (Czech family name data)

Acknowledgments

Mason Porter(University of Oxford)

Beom Jun [Kim (from Gimhae)]: [(김해)김]범준 (Sungkyunkwan University)

Danny Abrams(Northwestern University)

Robyn Ffrancon(University of Gothenburg)

easy to determine. The first entry in a jokbo (see Table I forour ten jokbo) could have resulted from the invention ofcharacters or printing devices rather than from the true birthof a clan [20].Ultimately, our data are insufficient to definitively accept

or reject the hypothesis of human diffusion. However, asour analysis demonstrates, our data are consistent with thetheory of simultaneous human “diffusion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure diffusion is correct, then our estimated diffusionconstants indicate a possible time scale for relaxation to adynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximatelyð100 000 km2Þ=ð1.5 km2=yearÞ ≈ 67 000 years for purelydiffusive mixing to produce a well-mixed society. Aconvective process thus appears to be playing the importantrole of promoting human interaction by accelerating mixingin the population. Despite the limitations imposed by ourdata, we try to estimate and quantify the centrality of Seoulusing a network-flow model for population, and we findsuggestive differences between the flow patterns of ergodicand nonergodic clans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Koreanculture provides an unusual opportunity for quantitativeresearch on historical human mobility and migration, andour investigation strongly suggests that both “diffusive”and “convective” patterns have played important rolesin establishing the current distribution of clans in Korea.By studying the geographical locations of clan originsin jokbo (Korean family books), we have quantified theextent of “ergodicity” of Korean clans as reflected intime series of marriage snapshots. This underscores theutility of investigating the location distributions of indi-vidual clans. Additionally, by comparing our results fromKorean clans to those from Czech families, we have alsodemonstrated that our approach can give insightful indi-cations of different mobility and migration patterns indifferent cultures. Our ergodicity analysis using moderncensus data clearly illustrates that there are both ergodicand nonergodic clans, and we have used these results tosuggest two different mechanisms for human migration onlong time scales. Many mobility processes involve abalance between diffusive spreading and the attractivenessof one or more central locations (and between more generaldiffusive and convective fluxes), so we believe that ourapproach in the present paper will be valuable for manysituations.A noteworthy feature of our analysis is that we used both

data with high temporal resolution but low spatial reso-lution (jokbo data) and data with high spatial resolution butlow temporal resolution (census data). This allowed us toconsider both the patterns of human movement on shorttime scales (mobility via individual marriage processes)

and their consequences for human locations on long timescales (human migration via clan ergodicity). An in-teresting further wrinkle would be to compare suchmobility-derived time scales for human mixing patternsto genetically-derived patterns [56–59].From a more general perspective, our research has

allowed us to test the idea of using a physical analogyfor modeling human migration—an idea put forth (but notquantified) as early as the 19th century [1–10]. Physics-inspired ideas have been very successful for the study ofhuman mobility, which occurs on shorter time scales thanhuman migration, and we propose that Ravenstein wascorrect when he posited that such ideas are also useful forhuman migration.

ACKNOWLEDGMENTS

We thank Hawoong Jeong (정하웅) for providing datafrom Korean family books and Josef Novotný for providingdata on surnames in the Czech Republic. We thank TimEvans for introducing us to helpful references and ErikBollt, Valentin Danchev, Sandra González-Bailón, JamesIrish, Philip Kreager, Michael Murphy, and TommyMurphy for helpful comments and discussions. We thankMarc Barthelemy and Richard Morris for details about theirwork on constructing flow networks [64], and we thank theanonymous referees for their helpful comments and sug-gestions. M. A. P. and S. H. L. acknowledge support fromthe Engineering and Physical Sciences Research Council(EPSRC) through Grant No. EP/J001759/1. B. J. K. wassupported by a National Research Foundation of Korea(NRF) grant funded by the Korean government(No. 2014R1A2A2A01004919). D. M. A. was supportedby Grant No. 220020230 from the James S. McDonnellFoundation. S. H. L. did the majority of his work atUniversity of Oxford.S. H. Lee and R. Ffrancon contributed equally to

this work.

APPENDIX A: JOKBO DATA

In our investigation, we examine ten digitized jokbothat were first studied in Ref. [21]. In Table I in themain text, we give basic information about the ten jokbo,and we now summarize the results of some of ourcomputations.First, we apply the same gravity-model fit that we used

for jokbo 1 to all of the jokbo data, and the results do notdeviate much from those for jokbo 1. That is, γ ≈ 0 andα ≈ 1, so we can apply the population-product model withα ¼ 1. The largest deviations in the two parameter valuesare α ≈ 1.4930 (for jokbo 7) and γ ≈ 0.5377 (for jokbo 6).Interestingly, we could not find any empirical value ofγ < 0.6 reported in the literature [4,5,45–49], and it seemsto be extremely rare to report any empirical values at all forgravity-model parameters. As one can see in Fig. 6, the

MATCHMAKER, MATCHMAKER, MAKE ME A MATCH: … PHYS. REV. X 4, 041009 (2014)

041009-9

SHL, R. Ffrancon, D. M. Abrams, B. J. Kim (김범준), and M. A. Porter, Phys. Rev. X 4, 041009 (2014).

Page 37: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

introduced in . . .

Sang  Hoon  Lee/Sungkyunkwan  University

Matchmaker,  Matchmaker,  Make  Me  a  Match:  Migration  of  Populations  via  Marriages  in  the  Past

Sang  Hoon  Lee  ( ),  Robyn  Ffrancon,  Daniel  M.  Abrams,  Beom  Jun  Kim  ( ),  and  Mason  A.  PorterPhys.  Rev.  X  4,  041009  (2014)Published  October  16,  2014

Synopsis:  Wedding  Registries  Reveal  MigrationPaths

Patterns  in  human  movement  have  been  studied  using  traffic  data,  mobile  phone  records,  and  even  dollar  billcirculation.  A  new  investigation  uses  centuries-­old  Korean  family  books,  called  jokbos  (  in  Korean),  as  arecord  of  emigration  by  female  brides.  The  analysis  of  this  movement,  as  well  as  more  recent  census  data,shows  that  both  diffusion-­  and  convection-­like  phenomena  may  play  a  role  in  mixing  different  populations.

Understanding  human  mobility  is  important  for  improving  city  planning  and  for  responding  to  outbreaks  ofdisease.  Previous  modeling  work  has  suggested  that  the  statistical  patterns  in  human  movement  resemblecertain  physical  phenomena,  such  as  the  diffusion  of  molecules  in  liquids.  However,  these  patterns  are  drawnprimarily  from  short-­term  (a  day  to  at  most  a  year)  tracking  data,  so  it’s  unclear  whether  the  same  models  applyto  longer-­term  migrations  that  affect  cultural  and  genetic  dissemination.

In  Korea,  marriages  have  traditionally  involved  the  relocation  of  the  bride  to  her  groom’s  home.  Sang  Hoon  Leeof  Sungkyunkwan  University,  Korea,  and  his  colleagues  analyzed    of  these  marriages  catalogued  inten  jokbos  that  date  as  far  back  as  the   th  century.  The  books  don’t  provide  specific  locations,  so  theresearchers  inferred  migration  paths  from  the  geographic  regions  associated  with  each  bride  and  groom’s  clannames.  Because  the  bride  migration  rate  between  regions  was  primarily  dependent  on  clan  populationdensities,  the  team  tried  to  explain  migration  with  a  model  in  which  Koreans  moved  randomly  away  (diffused)from  high  concentrations  of  their  own  clan.  However,  the  estimated  diffusion  rate  was  too  slow  to  explain  howwell  certain  clans  have  spread  throughout  the  country.  The  team  concluded  that  a  directed  convective-­like  flowtowards  the  capital  city  Seoul  enhanced  the  mixing  rate.  According  to  the  authors,  the  combination  of  diffusionand  convection  may  explain  other  human  flow  patterns.

This  research  is  published  in  Physical  Review  X.

–Michael  Schirber

ISSN  1943-­2879.  Use  of  the  American  Physical  Society  websites  and  journals  implies  that  the  user  has  readand  agrees  to  our  Terms  and  Conditions  and  any  applicable  Subscription  Agreement.

200 00013

Page 38: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

introduced in . . .

Sang  Hoon  Lee/Sungkyunkwan  University

Matchmaker,  Matchmaker,  Make  Me  a  Match:  Migration  of  Populations  via  Marriages  in  the  Past

Sang  Hoon  Lee  ( ),  Robyn  Ffrancon,  Daniel  M.  Abrams,  Beom  Jun  Kim  ( ),  and  Mason  A.  PorterPhys.  Rev.  X  4,  041009  (2014)Published  October  16,  2014

Synopsis:  Wedding  Registries  Reveal  MigrationPaths

Patterns  in  human  movement  have  been  studied  using  traffic  data,  mobile  phone  records,  and  even  dollar  billcirculation.  A  new  investigation  uses  centuries-­old  Korean  family  books,  called  jokbos  (  in  Korean),  as  arecord  of  emigration  by  female  brides.  The  analysis  of  this  movement,  as  well  as  more  recent  census  data,shows  that  both  diffusion-­  and  convection-­like  phenomena  may  play  a  role  in  mixing  different  populations.

Understanding  human  mobility  is  important  for  improving  city  planning  and  for  responding  to  outbreaks  ofdisease.  Previous  modeling  work  has  suggested  that  the  statistical  patterns  in  human  movement  resemblecertain  physical  phenomena,  such  as  the  diffusion  of  molecules  in  liquids.  However,  these  patterns  are  drawnprimarily  from  short-­term  (a  day  to  at  most  a  year)  tracking  data,  so  it’s  unclear  whether  the  same  models  applyto  longer-­term  migrations  that  affect  cultural  and  genetic  dissemination.

In  Korea,  marriages  have  traditionally  involved  the  relocation  of  the  bride  to  her  groom’s  home.  Sang  Hoon  Leeof  Sungkyunkwan  University,  Korea,  and  his  colleagues  analyzed    of  these  marriages  catalogued  inten  jokbos  that  date  as  far  back  as  the   th  century.  The  books  don’t  provide  specific  locations,  so  theresearchers  inferred  migration  paths  from  the  geographic  regions  associated  with  each  bride  and  groom’s  clannames.  Because  the  bride  migration  rate  between  regions  was  primarily  dependent  on  clan  populationdensities,  the  team  tried  to  explain  migration  with  a  model  in  which  Koreans  moved  randomly  away  (diffused)from  high  concentrations  of  their  own  clan.  However,  the  estimated  diffusion  rate  was  too  slow  to  explain  howwell  certain  clans  have  spread  throughout  the  country.  The  team  concluded  that  a  directed  convective-­like  flowtowards  the  capital  city  Seoul  enhanced  the  mixing  rate.  According  to  the  authors,  the  combination  of  diffusionand  convection  may  explain  other  human  flow  patterns.

This  research  is  published  in  Physical  Review  X.

–Michael  Schirber

ISSN  1943-­2879.  Use  of  the  American  Physical  Society  websites  and  journals  implies  that  the  user  has  readand  agrees  to  our  Terms  and  Conditions  and  any  applicable  Subscription  Agreement.

200 00013

Page 39: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

introduced in . . .

Sang  Hoon  Lee/Sungkyunkwan  University

Matchmaker,  Matchmaker,  Make  Me  a  Match:  Migration  of  Populations  via  Marriages  in  the  Past

Sang  Hoon  Lee  ( ),  Robyn  Ffrancon,  Daniel  M.  Abrams,  Beom  Jun  Kim  ( ),  and  Mason  A.  PorterPhys.  Rev.  X  4,  041009  (2014)Published  October  16,  2014

Synopsis:  Wedding  Registries  Reveal  MigrationPaths

Patterns  in  human  movement  have  been  studied  using  traffic  data,  mobile  phone  records,  and  even  dollar  billcirculation.  A  new  investigation  uses  centuries-­old  Korean  family  books,  called  jokbos  (  in  Korean),  as  arecord  of  emigration  by  female  brides.  The  analysis  of  this  movement,  as  well  as  more  recent  census  data,shows  that  both  diffusion-­  and  convection-­like  phenomena  may  play  a  role  in  mixing  different  populations.

Understanding  human  mobility  is  important  for  improving  city  planning  and  for  responding  to  outbreaks  ofdisease.  Previous  modeling  work  has  suggested  that  the  statistical  patterns  in  human  movement  resemblecertain  physical  phenomena,  such  as  the  diffusion  of  molecules  in  liquids.  However,  these  patterns  are  drawnprimarily  from  short-­term  (a  day  to  at  most  a  year)  tracking  data,  so  it’s  unclear  whether  the  same  models  applyto  longer-­term  migrations  that  affect  cultural  and  genetic  dissemination.

In  Korea,  marriages  have  traditionally  involved  the  relocation  of  the  bride  to  her  groom’s  home.  Sang  Hoon  Leeof  Sungkyunkwan  University,  Korea,  and  his  colleagues  analyzed    of  these  marriages  catalogued  inten  jokbos  that  date  as  far  back  as  the   th  century.  The  books  don’t  provide  specific  locations,  so  theresearchers  inferred  migration  paths  from  the  geographic  regions  associated  with  each  bride  and  groom’s  clannames.  Because  the  bride  migration  rate  between  regions  was  primarily  dependent  on  clan  populationdensities,  the  team  tried  to  explain  migration  with  a  model  in  which  Koreans  moved  randomly  away  (diffused)from  high  concentrations  of  their  own  clan.  However,  the  estimated  diffusion  rate  was  too  slow  to  explain  howwell  certain  clans  have  spread  throughout  the  country.  The  team  concluded  that  a  directed  convective-­like  flowtowards  the  capital  city  Seoul  enhanced  the  mixing  rate.  According  to  the  authors,  the  combination  of  diffusionand  convection  may  explain  other  human  flow  patterns.

This  research  is  published  in  Physical  Review  X.

–Michael  Schirber

ISSN  1943-­2879.  Use  of  the  American  Physical  Society  websites  and  journals  implies  that  the  user  has  readand  agrees  to  our  Terms  and  Conditions  and  any  applicable  Subscription  Agreement.

200 00013

Page 40: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

introduced in . . .

Sang  Hoon  Lee/Sungkyunkwan  University

Matchmaker,  Matchmaker,  Make  Me  a  Match:  Migration  of  Populations  via  Marriages  in  the  Past

Sang  Hoon  Lee  ( ),  Robyn  Ffrancon,  Daniel  M.  Abrams,  Beom  Jun  Kim  ( ),  and  Mason  A.  PorterPhys.  Rev.  X  4,  041009  (2014)Published  October  16,  2014

Synopsis:  Wedding  Registries  Reveal  MigrationPaths

Patterns  in  human  movement  have  been  studied  using  traffic  data,  mobile  phone  records,  and  even  dollar  billcirculation.  A  new  investigation  uses  centuries-­old  Korean  family  books,  called  jokbos  (  in  Korean),  as  arecord  of  emigration  by  female  brides.  The  analysis  of  this  movement,  as  well  as  more  recent  census  data,shows  that  both  diffusion-­  and  convection-­like  phenomena  may  play  a  role  in  mixing  different  populations.

Understanding  human  mobility  is  important  for  improving  city  planning  and  for  responding  to  outbreaks  ofdisease.  Previous  modeling  work  has  suggested  that  the  statistical  patterns  in  human  movement  resemblecertain  physical  phenomena,  such  as  the  diffusion  of  molecules  in  liquids.  However,  these  patterns  are  drawnprimarily  from  short-­term  (a  day  to  at  most  a  year)  tracking  data,  so  it’s  unclear  whether  the  same  models  applyto  longer-­term  migrations  that  affect  cultural  and  genetic  dissemination.

In  Korea,  marriages  have  traditionally  involved  the  relocation  of  the  bride  to  her  groom’s  home.  Sang  Hoon  Leeof  Sungkyunkwan  University,  Korea,  and  his  colleagues  analyzed    of  these  marriages  catalogued  inten  jokbos  that  date  as  far  back  as  the   th  century.  The  books  don’t  provide  specific  locations,  so  theresearchers  inferred  migration  paths  from  the  geographic  regions  associated  with  each  bride  and  groom’s  clannames.  Because  the  bride  migration  rate  between  regions  was  primarily  dependent  on  clan  populationdensities,  the  team  tried  to  explain  migration  with  a  model  in  which  Koreans  moved  randomly  away  (diffused)from  high  concentrations  of  their  own  clan.  However,  the  estimated  diffusion  rate  was  too  slow  to  explain  howwell  certain  clans  have  spread  throughout  the  country.  The  team  concluded  that  a  directed  convective-­like  flowtowards  the  capital  city  Seoul  enhanced  the  mixing  rate.  According  to  the  authors,  the  combination  of  diffusionand  convection  may  explain  other  human  flow  patterns.

This  research  is  published  in  Physical  Review  X.

–Michael  Schirber

ISSN  1943-­2879.  Use  of  the  American  Physical  Society  websites  and  journals  implies  that  the  user  has  readand  agrees  to  our  Terms  and  Conditions  and  any  applicable  Subscription  Agreement.

200 00013