
ANNALS OF HUMAN BIOLOGY, 2001, VOL. 28, NO. 3, 295-307

Human mitochondrial DNA sequence variation in the Moroccan population of the Souss area

Z. BRAKEZ†, E. BOSCH‡*, H. IZAABEL†, O. AKHAYAT†, D. COMAS‡, J. BERTRANPETIT‡ and F. CALAFELL‡

† Laboratoire de Biologie Cellulaire et Moléculaire, Faculté des Sciences, Université Ibnou-Zohr, Agadir, Morocco
‡ Unitat de Biologia Evolutiva, Facultat de Ciències de la Salut i de la Vida, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
* Present address: Department of Genetics, University of Leicester, Leicester, UK

Received 15 February 2000; accepted 29 June 2000

Annals of Human Biology ISSN 0301-4460 print/ISSN 1464-5033 online © 2001 Taylor & Francis Ltd
http://www.tandf.co.uk/journals

Summary. Background: Various populations have contributed to the present-day gene pool of Morocco, including the autochthonous Berber population, Phoenicians, Sephardic Jews, Bedouin Arabs and sub-Saharan Africans.
Objective: The primary objective of the study was to complete a genetic description of the Berber-speaking population in the Souss region of southern Morocco, based on mitochondrial DNA (mtDNA) sequence analysis.
Subjects and methods: The first hypervariable segment of the mtDNA control region was sequenced in a sample of 50 individuals from the Souss Valley, and the results were compared with the extensive body of data available on mtDNA sequence variation in Europe and sub-Saharan Africa.
Results: Thirty-four different sequences were found; an estimated 68% of the sequences occurred throughout Europe, West Asia and North Africa, 26% originated in sub-Saharan Africa, and 6% belonged to the North African-specific haplogroup U6. The Souss Valley mtDNA sequences indicated the presence of two populations which expanded at different times: the West Eurasian sequences in the Souss sample had a smaller average number of pairwise differences than pairs of sub-Saharan sequences.
Conclusion: Detailed knowledge of the possible geographic origin of each sequence facilitated an interpretation of both internal diversity parameters and between-population relationships. The sub-Saharan admixture in the Souss Valley matched the south-north cline of sub-Saharan influence in North Africa, also evident in the genetic distances of North African populations to Europeans and sub-Saharan Africans.

1. Introduction
Morocco is located in the extreme north-western corner of Africa; the present Moroccan population is made up of both Berber- and Arabic-speaking groups whose languages belong to the Afroasiatic family (Ruhlen 1987). Berbers are descendants of the autochthonous population of Morocco. Neolithic farmers may have introduced into North Africa their Afroasiatic language (Renfrew 1991, Barbujani, Pilastro, De Domenico et al. 1994), which has since differentiated into 30 different languages, spoken from Egypt to Morocco and from the Gibraltar Straits to northern Senegal. Since that time, the first important invasion was that of the Phoenicians, who came from the East Mediterranean sea-coast around 1000 BC and represented almost one-tenth of the North African population at the time of the Roman conquest, in 146 BC (Julian 1961, Gaid 1990). After the destruction of the temple of Jerusalem, Jewish communities also settled in many North African localities during the 1st century AD. By the end of the 7th century, Arab armies arrived to expand the Islamic religion and the Arabic language into North Africa. This expansion had strong cultural importance, but its demographic impact does not seem to have been quantitatively important (Newman 1995). The late 10th century witnessed an important movement of Arab populations, which continued for centuries. Those were mostly Bedouin Arabs (Brignon, Amine, Boutaleb et al. 1967). From their interactions with the Berbers arose the Moorish culture, which extended both south into the Sahara and north into Spain. Other immigrants came gradually from the north and from the south (Black slaves from Sudan and Sahel). All these populations could have contributed to the present Moroccan gene pool.

Berbers from Morocco presently belong to three main communities: Northern Berbers live in the Rif mountains, are known as Rifis and speak Tarifit; Central Berbers live in the High Atlas, are called Imazighs and speak Tamazight; and Southern Berbers live in the Anti-Atlas and the Souss valley, are known as Chleuh and speak Tachelhit. According to Murdock (1959), Berber languages fall into several groups, among which the Tachelhit include many tribes of the Grand Atlas, the Anti-Atlas, the intervening valley of the Souss River and the adjacent coast of Morocco.

Genetic analysis has been able to clarify many population history issues, be it through classical genetic markers (Excoffier, Pellegrini, Sánchez-Mazas et al. 1987, Cavalli-Sforza, Menozzi and Piazza 1994) or, more recently, at the DNA level. Mitochondrial DNA (mtDNA) is at present the object of study in many populations from different geographical regions, and has been particularly used for studying European populations (Comas, Calafell, Mateu et al. 1997, Simoni, Calafell, Pettener et al. 2000 and over 30 references therein), and, to a lesser extent, African populations (Vigilant, Stoneking, Harpending et al. 1991, Graven, Passarino, Semino et al. 1995, Watson, Bauer, Aman et al. 1996, Watson, Forster, Richards et al. 1997, Mateu, Comas, Calafell et al. 1997, Rando, Pinto, González et al. 1998, Krings, Salem, Bauer et al. 1999). A few studies have centred on North African populations: Côrte-Real et al. (1996) analysed the Berber Mzabites from Algeria (further analysed by Macaulay, Richards, Hickey et al. 1999), Krings et al. (1999) the Egyptians and Nubians, and Rando et al. (1998) surveyed the Saharawi, Mauritanian, Moroccan Arab and Moroccan Berber. The latter sample pooled Northern (Rif) and Central (Tamazigh) Moroccan Berbers, with few, if any, individuals from the Souss Valley in the South, where Tachelhit Berbers live.

Genetic variation in mtDNA has been studied with different techniques, mainly through high-resolution Restriction Fragment Length Polymorphism (RFLP) analysis (Torroni, Schurr, Yang et al. 1992), which allows broad, continent-specific groups of sequences, or haplogroups, to be defined. A different approach consists of sequencing the hypervariable fragments of the control region (Vigilant et al. 1991). It has been shown that most haplogroups can be recognized by specific nucleotide motifs in the control region (Torroni, Huoponen, Francalacci et al. 1996, Macaulay et al. 1999). The haplogroups constituting the mtDNA pool of sub-Saharan Africa have been well established (Chen, Torroni, Excoffier et al. 1995), as well as their correspondence with control region motifs (Watson et al. 1997, Rando et al. 1998, Macaulay et al. 1999); these haplogroups have rarely been found in individuals of non-African descent. This is also the case with European (and, more generally, West Eurasian) haplogroups (Torroni et al. 1996, Macaulay et al. 1999). Thus, in almost all cases, the sequence motifs in the control region can point to a sub-Saharan African or to a West Eurasian origin for any given mtDNA sequence. In this way, Rando et al. (1998) found that up to a quarter of the North-West African maternal lineages originated south of the Sahara.

We have analysed the control region of mtDNA of 50 unrelated individuals all originating from the Souss area (in the south of Morocco), in a population that had previously been characterized for HLA variation (Izaabel, Garchon, Caillat-Zucman et al. 1998), in order to describe their matrilineal genetic variability and to complete the picture of the genetics of Moroccan populations.

2. Material and methods
2.1. DNA extraction, amplification and sequencing

DNA was extracted from fresh blood (5-10 mL in EDTA tubes) of 50 unrelated Moroccan individuals from the Souss area, by alkali bursting of blood cells followed by protein digestion with proteinase K (Grimberg, Nawoschik, Belluscio et al. 1989). After a phenol-chloroform extraction, DNA was precipitated with absolute ethanol and dried at room temperature. The DNA pellet was dissolved in 200 µL TE (Tris-HCl 10 mM, pH 7.5; EDTA 1 mM).

Amplification was performed in a Perkin Elmer 9600 thermal cycler using 200 ng DNA in a 25 µL reaction volume; the temperature profile for 30 cycles of amplification was 94°C for 1 min, 58°C for 1 min and 72°C for 1 min. The primers used in this reaction, L15996 (5'-CTCCACCATTAGCACCCAAAGC-3') and H16401 (5'-TGATTTCACGGAGGATGGTG-3'), amplified a segment of 446 base pairs (bp) containing the 360 bp hypervariable region that was subsequently sequenced.

The amplification product was purified with the Gene Clean kit (BIO 101) before cycle sequencing. The sequencing reactions were performed separately on each strand with the primers L15996 and H16401 and were carried out using a sequencing kit (Terminator ready Rxn with Ampli Taq FS). The sequencing profile for 25 cycles was 96°C for 10 s, 50°C for 5 s and 60°C for 4 min. The product of the sequencing reaction was precipitated in absolute ethanol, dried at room temperature and run in an ABI PRISM 377 (Perkin Elmer) automatic sequencer.
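The 446 bp amplicon length quoted above can be checked from the primer names and lengths. The sketch below assumes that each primer name gives the reference position of the primer's 3' base (a common naming convention for mtDNA primers, but an assumption here, since the text does not state it); the function name `amplicon_span` is hypothetical.

```python
# Primer sequences as given in the text (written 5'->3').
L15996 = "CTCCACCATTAGCACCCAAAGC"
H16401 = "TGATTTCACGGAGGATGGTG"


def amplicon_span(l_pos, l_primer, h_pos, h_primer):
    """Return (start, end, length) of the amplified segment.

    Assumption: l_pos and h_pos are the reference positions of the 3' base
    of each primer, so the L primer extends upstream of l_pos and the
    H primer extends downstream of h_pos.
    """
    start = l_pos - len(l_primer) + 1  # 5' end of the L-strand primer
    end = h_pos + len(h_primer) - 1    # 5' end of the H-strand primer
    return start, end, end - start + 1


start, end, length = amplicon_span(15996, L15996, 16401, H16401)
print(start, end, length)
```

Under that assumption the amplified segment spans positions 15975 to 16420, which gives the 446 bp stated in the text.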

2.2. Numerical analyses
Sequences were aligned with the ESEE program (Cabot 1988) and the segment from positions 16024 to 16383 (Anderson, Bankier, Barrell et al. 1981) was used for analysis. The final information for each individual was a string of 360 characters for the bases from positions 16024 to 16383, corresponding to hypervariable region I. The data were analysed by standard packages, especially PHYLIP 3.5c (Felsenstein 1989) and Arlequin (Schneider, Kueffer and Excoffier 1997). Distances between sequences were computed with the DNADIST program in the PHYLIP package according to Kimura's two-parameter model, with the transition to transversion ratio set to 15:1 according to Tamura and Nei (1993). A neighbour-joining tree (Saitou and Nei 1987) was built from a sequence distance matrix. Phylogenetic relationships among sequences were also displayed by means of a median network (Bandelt, Forster, Sykes et al. 1995).
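To illustrate what a Kimura two-parameter distance measures, here is a minimal sketch of the textbook closed-form estimator, which treats transitions and transversions separately. It is not the DNADIST implementation: DNADIST takes the transition to transversion ratio as a fixed input (15:1 in this study), so it will not in general return the same value. The function name and the toy sequences are hypothetical.

```python
import math

# Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Closed-form Kimura two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = ts / sites  # proportion of transitional differences
    q = tv / sites  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one C->T transition in ten sites.
d = k2p_distance("ACGTACGTAC", "ACGTACGTAT")
print(round(d, 4))
```

For closely related sequences such as HVR-I haplotypes within one population, this distance stays close to the raw proportion of differing sites.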

Control region sequences were assigned to RFLP-defined haplogroups by first comparing their sequences to those in the datasets where mtDNA had been typed both for RFLPs and for HVR-I sequences (Torroni et al. 1996, Watson et al. 1997, Rando et al. 1998, Macaulay et al. 1999). If no match was found, we tried to recognize in HVR-I the haplogroup sequence motifs defined by Macaulay et al. (1999).

Table 1. Nucleotide and sequence divergence in several populations. n, sample size; k, number of different sequences; D, sequence diversity; pw, average number of pairwise differences.

Population              n     k     D       pw      Sources
North Africa
  Souss                 50    34    0.961   4.60    1
  NC Moroccan Berber    60    38    0.963   4.53    2
  Moroccan Arab         32    29    0.988   6.21    2
  Mozabite              85    29    0.942   4.13    3
  Saharawi              25    20    0.973   5.33    2
  Mauritanian           30    23    0.975   6.09    2
  Tuareg                26    21    0.985   7.10    4
  Egyptian              68    58    0.992   7.26    5
  Nubian                79    53    0.976   8.50    5
Sub-Saharan Africa
  Serer                 23    18    0.968   7.53
  Woloff                48    39    0.992   7.62
  Senegalese            50    42    0.989   6.28
  Mandenka              119   54    0.971   7.75    6
  Hausa                 20    19    0.995   6.15    4
  Yoruba                35    33    0.997   7.76    4,7
  Songhai               10    9     0.978   9.24    4
  Sudanese              75    64    0.994   8.74    5
  Somali                27    24    0.991   7.76    4
  Pygmy                 17          0.944   10.27
Europe and Middle East
  Basque                106   52    0.936   2.95    3,8
  British               100   71    0.976   4.45    9
  Swiss                 72    43    0.965   3.75    10
  Spanish               89    70    0.983   5.02    3,11
  Druze                 45    26    0.952   5.22    12

Sources: 1, present study; 2, Rando et al. 1998; 3, Côrte-Real, Macaulay, Richards et al. 1996; 4, Watson et al. 1996; 5, Krings et al. 1999; 6, Graven et al. 1995; 7, Vigilant et al. 1991; 8, Bertranpetit, Sala, Calafell et al. 1995; 9, Piercy, Sullivan, Benson et al. 1993; 10, Pult, Sajantila, Simanainen et al. 1994; 11, Pinto, González, Hernández et al. 1996; 12, Macaulay et al. 1999.

Data from other North African, European and sub-Saharan African populations were used for comparison (see sample sizes and sources listed in table 1). Genetic distances between populations were obtained by intrapopulation correction of interpopulation pairwise differences (intermatch-mismatch) using the equation $D = d_{ij} - (d_{ii} + d_{jj})/2$ (Nei 1987), where $d_{ij}$ is the mean nucleotide pairwise difference between populations i and j, $d_{ii}$ is the mean nucleotide pairwise difference within population i and $d_{jj}$ is the mean nucleotide pairwise difference within population j. The standard error of the distances was computed from 1000 bootstrap iterations (Efron 1982) by resampling nucleotide positions. A neighbour-joining tree was built from the genetic distance matrix, as well as a principal coordinate graph (Gower 1966).
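The intermatch-mismatch distance and its bootstrap standard error can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: differences are plain counts of differing sites, each population holds at least two aligned sequences, and all function names and the toy data are hypothetical.

```python
import random
from itertools import combinations, product


def n_diff(a, b, sites=None):
    """Number of differing positions, optionally over a resampled site list."""
    idx = range(len(a)) if sites is None else sites
    return sum(a[i] != b[i] for i in idx)


def mean_within(pop, sites=None):
    """Mean pairwise difference within one population (d_ii)."""
    pairs = list(combinations(pop, 2))
    return sum(n_diff(a, b, sites) for a, b in pairs) / len(pairs)


def mean_between(pop_i, pop_j, sites=None):
    """Mean pairwise difference between two populations (d_ij)."""
    total = sum(n_diff(a, b, sites) for a, b in product(pop_i, pop_j))
    return total / (len(pop_i) * len(pop_j))


def intermatch_mismatch(pop_i, pop_j, sites=None):
    """D = d_ij - (d_ii + d_jj) / 2."""
    within = (mean_within(pop_i, sites) + mean_within(pop_j, sites)) / 2
    return mean_between(pop_i, pop_j, sites) - within


def bootstrap_se(pop_i, pop_j, n_boot=1000, seed=0):
    """Standard error of D from resampling nucleotide positions with replacement."""
    rng = random.Random(seed)
    length = len(pop_i[0])
    values = []
    for _ in range(n_boot):
        sites = [rng.randrange(length) for _ in range(length)]
        values.append(intermatch_mismatch(pop_i, pop_j, sites))
    mean = sum(values) / n_boot
    return (sum((v - mean) ** 2 for v in values) / (n_boot - 1)) ** 0.5


# Toy populations of aligned 4 bp sequences (hypothetical data).
pop_a = ["AAAA", "AAAT"]
pop_b = ["TTAA", "TTAT"]
dist = intermatch_mismatch(pop_a, pop_b)
se = bootstrap_se(pop_a, pop_b, n_boot=200)
print(dist, se)
```

In the toy data the between-population mean is 2.5 differences and each within-population mean is 1, so the corrected distance is 1.5.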


3. Results
3.1. Sequence and population diversity

The complete sequence of a 360 bp segment of the control region of the mtDNA from position 16024 to 16383 according to Anderson et al. (1981) was determined for 50 unrelated individuals from the Moroccan Souss area. A total of 34 different sequences were found with 38 variable positions (figure 1). All nucleotide changes were transitions, except for a C to A transversion at position 16114. At all sites, the nucleotide described in the reference sequence is the most common; a few sites show high levels of polymorphism, like position 16223 with a T in 11 sequences (32.4%), or position 16126 with a C in 8 sequences (23.5%).

Nine individuals (18%) shared the so-called Cambridge Reference Sequence (CRS; Anderson et al. 1981), while 28 individuals had a unique haplotype (56%). Three different sequences were found twice (6%), one three times, and another four times. This sequence frequency spectrum can be summarized in the sequence diversity parameter, D = 0.961, which is similar to that found in other North African populations (table 1). Sequence diversity in North African populations ranges from 0.942 to 0.992, with a median of 0.976. This range is intermediate between those of Europeans (range 0.799-0.993, median 0.965, Simoni et al. 2000) and sub-Saharan Africans (range 0.944-0.997, median 0.990, table 1).
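The sequence diversity value can be reproduced from the frequency spectrum given above. The sketch assumes that D is Nei's (1987) unbiased estimator, D = n/(n - 1) × (1 - Σ p_i²); the text does not spell out the formula, so this is an assumption, and the function name is hypothetical.

```python
def sequence_diversity(counts):
    """Nei's unbiased diversity from the number of copies of each distinct sequence."""
    n = sum(counts)
    sum_p2 = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1 - sum_p2)


# Souss spectrum from the text: the CRS in 9 individuals, one sequence found
# four times, one three times, three sequences found twice, 28 unique sequences.
souss_counts = [9, 4, 3, 2, 2, 2] + [1] * 28

print(sum(souss_counts), len(souss_counts))         # 50 individuals, 34 sequences
print(round(sequence_diversity(souss_counts), 3))   # 0.961
```

The spectrum adds up to the 50 individuals and 34 different sequences reported, and the estimator returns the D = 0.961 quoted in the text.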

When compared with other populations, 16 sequences (47.1%, borne by 60% of the individuals in the sample) were shared with Europeans, while eight others (23.5%, 20% of the individuals) were shared with sub-Saharan Africans. Fifteen sequences (44.1%, 58% of the individuals) were also found in other North African populations; of these, only one (sequence 8, figure 1) was not found in European or West Asian populations. Seven sequences (20.6%), all of them found in a single copy, had not been described in other populations.

3.2. Sequence origin
We have assigned every control region sequence to a haplogroup by using the sequence motifs defined by Rando et al. (1998) and by Macaulay et al. (1999), or, in the ambiguous cases, by comparing the Souss sequences to other published datasets in which both the control region sequences and the RFLP haplotypes were given. Twelve different haplogroups have been found in the Souss population: H (borne by 32% of the individuals; figure 1), V (10%), J (10%), L2 (10%), L3a (10%), U5 (6%), U6 (6%), L1b (6%), T (4%), HV/JT (2%), K (2%) and U3 (2%). As discussed below, this high haplogroup diversity may be due to admixture.

The relationship among the sequences in the sample has been explored by means of a median network (Bandelt et al. 1995, figure 2). Sequence 13 (CRS) was found in the centre of the network and almost all sequences clustered together with those belonging to the same haplogroup. A neighbour-joining tree (Saitou and Nei 1987, not shown) yielded a very similar topology.

After haplogroup assignment one can infer the broad geographic origin of every sequence. Haplogroups L1b, L2 and L3a are likely to have a sub-Saharan African origin since they are most frequent in that region and rare elsewhere (Watson et al. 1997); haplogroup U6 may have originated in North Africa (Rando et al. 1998), and all other haplogroups found in the Souss population are found throughout Europe, West Asia and North Africa. Thus, the origins of the mtDNA sequences in the Souss population can be summarized as 68% West Eurasian, 6% North African and 26% sub-Saharan African. The same sequence partition can be assessed for other Northern African populations, by considering that L haplogroups have a sub-Saharan African origin, U6 is autochthonous North African and all others but the East Asian (A to F) haplogroups have a broad European-West Asian distribution. When the sequence origins of Souss Valley sequences are compared to those of other North African populations (table 2), it can be seen that the populations south of Souss contain larger amounts of sub-Saharan African sequences (Saharawi, 44%; Mauritanian, 43%), while the populations north of Souss present smaller sub-Saharan fractions (North Central Moroccan Berber, 3%), with the exception of Moroccan Arabs (22%). This south-north cline of sub-Saharan African sequences seems to have a parallel in North-East Africa (Nubian, 41%; Egyptian, 31%), as described by Krings et al. (1999). The geographical distribution of the North African U6 haplogroup does not seem to present a clear pattern, with low frequencies (1-9%) in most populations, except for Mauritanians (20%) and Mozabites (32%).

Figure 1. Variable positions for 34 mitochondrial control region sequences (corresponding to 50 Souss individuals) compared to the Cambridge Reference Sequence (CRS, Anderson et al. 1981). Only nucleotides differing from the reference are shown. Dots indicate identity with CRS. FREQ: number of individuals bearing the same sequence in the Souss Valley sample. HG: haplogroup inferred for each sequence.

Figure 2. Median network of Souss Valley sequences. Haplogroup affiliation of each sequence is given in parentheses. Circle areas are proportional to the frequency of each sequence.

Table 2. Frequencies (and their standard errors) of North African (U6), sub-Saharan African (L) and West Eurasian haplogroups in North African populations.

Population            n     North African (U6)   Sub-Saharan African (L)   West Eurasian (all others)
Souss                 50    6.0% (±3.4%)         26.0% (±6.2%)             68.0% (±6.6%)
NC Moroccan Berber    60    8.3% (±3.6%)         3.3% (±3.6%)              88.4% (±4.1%)
Moroccan Arab         32    6.3% (±4.3%)         21.9% (±7.3%)             71.8% (±8.0%)
Mozabite              85    31.8% (±5.1%)        11.8% (±3.5%)             56.4% (±5.4%)
Saharawi              25    8.0% (±5.4%)         44.0% (±9.9%)             48.0% (±10.0%)
Mauritanian           30    20.0% (±7.3%)        43.3% (±9.0%)             36.7% (±8.8%)
Tuareg                26    7.7% (±5.2%)         84.6% (±7.1%)             7.7% (±5.2%)
Egyptian              68    1.5% (±1.3%)         30.9% (±5.0%)             67.6% (±5.1%)
Nubian                79    0                    40.5% (±5.5%)             59.5% (±5.5%)

NC, North Central.
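The partition of the Souss sample by sequence origin, and the standard errors reported for it in table 2, can be sketched as below. The haplogroup counts are the percentages from section 3.2 converted to individuals out of 50. The text does not say how the standard errors were obtained; the binomial formula sqrt(p(1 - p)/n) is an assumption that happens to reproduce the Souss row. Function names are hypothetical.

```python
import math

# Haplogroup counts in the Souss sample (percentages in the text x 50 individuals).
souss_haplogroups = {
    "H": 16, "V": 5, "J": 5, "L2": 5, "L3a": 5, "U5": 3,
    "U6": 3, "L1b": 3, "T": 2, "HV/JT": 1, "K": 1, "U3": 1,
}


def origin(haplogroup):
    """Broad geographic origin, following the rule stated in the text."""
    if haplogroup.startswith("L"):
        return "sub-Saharan African"
    if haplogroup == "U6":
        return "North African"
    return "West Eurasian"


def freq_with_se(count, n):
    """Frequency and its binomial standard error (assumed formula)."""
    p = count / n
    return p, math.sqrt(p * (1 - p) / n)


n = sum(souss_haplogroups.values())
totals = {}
for hg, count in souss_haplogroups.items():
    totals[origin(hg)] = totals.get(origin(hg), 0) + count

for label, count in totals.items():
    p, se = freq_with_se(count, n)
    print(f"{label}: {100 * p:.1f}% (±{100 * se:.1f}%)")
```

With 50 individuals this yields 68.0% (±6.6%) West Eurasian, 26.0% (±6.2%) sub-Saharan African and 6.0% (±3.4%) North African, matching the Souss row.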

3.3. Pairwise difference distribution

The pairwise difference distribution of the Souss mtDNA sequences (figure 3) shows a major mode at five differences and a smaller peak at nine differences, with an average of 4.60 differences. However, this distribution may be a compound of at least two different distributions. In fact, when pairs of sub-Saharan sequences found in Souss are compared, a different distribution emerges, with a major peak at nine differences, a smaller peak at two differences, and an average of 5.95 differences. In fact, the peak at nine differences of the total distribution is made up disproportionately by sub-Saharan-sub-Saharan sequence pairs (6.4% of total pairs, 23.1% of those with nine differences) and by West Eurasian-sub-Saharan pairs (54.5% of total pairs, 85.4% of those with nine differences). By contrast, the pairwise difference distribution of West Eurasian sequences in Souss appears shifted to the left, with a single peak at two differences and an average of 3.23 differences. This different behaviour in the pairwise difference distribution may be a consequence of the different age of African and West Eurasian populations. It is expected that populations with older expansions present pairwise difference distributions with peaks at higher values than those in populations with more recent expansions (Rogers and Harpending 1992). However, within North African populations, the average pairwise differences are strongly conditioned by the amount of sub-Saharan admixture. In fact, the average pairwise difference and the frequency of L haplogroups are highly (though not statistically significantly) correlated in North African populations (Pearson's r = 0.553, p = 0.122).
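A pairwise difference (mismatch) distribution is simply the histogram of the number of differing sites over all pairs of sequences in a sample. The sketch below uses hypothetical toy sequences to show how pooling two divergent groups produces a second mode at higher values, as described for the Souss sample; it is an illustration, not the Arlequin procedure.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of each number of pairwise differences."""
    counts = Counter(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)
    )
    total = sum(counts.values())
    return {d: counts[d] / total for d in sorted(counts)}


def mean_pairwise_differences(seqs):
    """Average number of pairwise differences in the sample."""
    return sum(d * f for d, f in mismatch_distribution(seqs).items())


# Two hypothetical groups that differ at four fixed sites.
group_a = ["AAAAAA", "AAAAAT"]
group_b = ["TTTTAA", "TTTTAT"]

pooled = mismatch_distribution(group_a + group_b)
print(pooled)
print(mean_pairwise_differences(group_a + group_b))
```

Within each toy group all pairs differ at one site, while pairs that span the two groups differ at four or five sites, so the pooled distribution has one mode from within-group pairs and another from between-group pairs.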

Figure 3. Pairwise difference distribution among Souss Valley sequences. Bold line, total distribution. Thin line, differences among pairs of sequences presumed to have a sub-Saharan origin. Broken line, differences among pairs of sequences presumed to have a West Eurasian origin.


3.4. Population relationships
The intermatch-mismatch genetic distance was computed among North African, European and sub-Saharan African populations, and the distance matrix was represented as a neighbour-joining tree (Saitou and Nei 1987). Since the neighbour-joining tree showed little structure beyond the separation of Europeans versus sub-Saharan Africans, and did not conform to a series of successive splits (as would be expected of an evolutionary tree), we preferred to display the distance matrix through principal coordinate analysis (figure 4). The first two principal coordinates explain 81.9% of the variance in the distance matrix and separate first Pygmies from all other populations. The remaining populations are contained in a cluster with Europeans at one end and sub-Saharan Africans at the other end. North African populations are located roughly according to their amount of sub-Saharan admixture and can be found next to Europeans (North Central Moroccan Berbers) or to sub-Saharan Africans (Nubian, Tuareg), although most are found in an intermediate position. The Souss population lies closer to Europeans. Most mtDNA genetic analyses show a clear separation between Europeans and Africans. However, when North African populations are included, this population distinction (which is always clear when considering individual sequences) is blurred given the sub-Saharan admixture in North Africa.
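Principal coordinate analysis (Gower 1966) turns a distance matrix into coordinates by double-centring the squared distances and taking the leading eigenvectors. The sketch below is a dependency-free illustration that extracts only the first coordinate by power iteration; the function name and toy matrix are hypothetical, it assumes the leading eigenvalue is positive, and the "explained" fraction is taken as that eigenvalue over the trace of the centred matrix, which may differ from how the 81.9% figure was computed.

```python
def first_principal_coordinate(dist, n_iter=200):
    """First principal coordinate of a symmetric distance matrix.

    Returns (coordinates, fraction of the trace carried by the first axis).
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # Gower double-centring: b_ij = -1/2 (d_ij^2 - row_i - row_j + grand mean)
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    v = [float(i + 1) for i in range(n)]  # non-constant start vector
    eigenvalue = 0.0
    for _ in range(n_iter):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        # Rayleigh quotient of the normalised vector
        eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n))
                         for i in range(n))

    scale = max(eigenvalue, 0.0) ** 0.5
    coords = [x * scale for x in v]
    trace = sum(b[i][i] for i in range(n))
    explained = eigenvalue / trace if trace > 0 else 0.0
    return coords, explained


# Toy matrix: three points on a line at positions 0, 1 and 3.
toy = [[0, 1, 3],
       [1, 0, 2],
       [3, 2, 0]]
coords, explained = first_principal_coordinate(toy)
print(coords, explained)
```

Because the toy points lie on a line, a single coordinate reproduces every distance and carries the whole trace; real population distance matrices need further axes.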

Figure 4. First and second principal coordinates estimated from the intermatch-mismatch genetic distance among North African, sub-Saharan African and European populations. The first two principal coordinates explain 81.9% of the variance in the distance matrix.

4. Discussion
We have sequenced the mtDNA HVR-I in 50 individuals from the Souss Valley in Morocco and we have been able to assign a broad geographical origin to each sequence. In particular, haplogroup classification of HVR-I sequences split those sequences into West Eurasian, North African-specific and sub-Saharan African. Thus, the Souss Valley mtDNA pool can be subdivided according to geography. Given its high frequencies in North Africa and its rarity elsewhere, it can be assumed (Rando et al. 1998) that haplogroup U6 originated in North Africa, but variation within other haplogroups may have also been generated in situ. This admixture analysis informs on the geographical origin of mtDNA sequences, but the admixture fractions should be regarded as the result of the total amount of gene flow over history. The tempo of gene flow is elusive: it could have occurred in a single generation or as a trickle over a long period.

West Eurasian sequences in the Souss Valley form the core of the mtDNA pool and, as in North Africa in general, seem to be part of the homogeneous West Eurasian continuum (Simoni et al. 2000), while sub-Saharan sequences in the Souss Valley seem to have a West African origin (Rando et al. 1998) and show a rough south-north cline, similar to that found in a principal component analysis of classical polymorphism allele frequencies (Bosch, Calafell, Pérez-Lezaun et al. 1997). Sub-Saharan admixture in North-West Africa was also detected in Y chromosome polymorphisms (Bosch, Calafell, Santos et al. 1999). This broad picture is compatible with North African populations sharing an origin with West Eurasians and having incorporated sub-Saharan African sequences via trans-Saharan gene flow. The clinal pattern of sub-Saharan mtDNA sequences in North Africa seems to point to a recent, maybe ongoing timeframe for gene flow into North Africa. The main exception to the clinal pattern is the relatively high frequency of sub-Saharan sequences in a northern, urban sample of Moroccan Arabs, which may be explained by the capacity of cities to attract migrants.

The amount of sub-Saharan admixture over a West Eurasian background seems to determine the position of the Souss Valley population when compared to other European and African populations. Previous knowledge of the geographical origin of the sequences allows the intermediate position of North African populations in a principal coordinate graph to be interpreted as the result of admixture, and allows other possible explanations for population relationships to be ruled out.

The analysis of a sample of mtDNA sequences by means of the distribution of their pairwise differences (also called a mismatch distribution) has been a fruitful approach in human population genetics (Harpending, Sherry, Rogers et al. 1993, Watson et al. 1996, among many others). Rogers and Harpending (1992) modelled the effects of a population expansion on the mismatch distribution and found that it created a bell-shaped distribution with a peak that travelled to the right with time. Thus, if a population exhibited a bell-shaped mismatch distribution, it was assumed that it had undergone an expansion in the past, and, given a mutation rate estimate, it was possible to estimate the expansion time from the mismatch distribution. However, it was also shown that other factors could create a bell-shaped mismatch distribution, such as the heterogeneity in mutation rates across nucleotide sites (Aris-Brosou and Excoffier 1996). The mismatch distribution of the Souss population shows that admixture from populations with different ages can also affect the mismatch distribution; in this case, given that the expansion age difference is large enough, it appears as two distinct peaks in the mismatch distribution. Mechanical estimation of an expansion age from such a distribution would lead to a meaningless value.

Acknowledgements

We would like to express our gratitude to Ibnou Zohr University, Agadir (Morocco) and to the Universitat de Barcelona (Spain) for the help and cooperation they provided for this project. We are indebted to Dr M. Bouhdoud and his laboratory staff for their constant support. We are also indebted to the personnel of the transfusion centre of Hassan II Hospital, Agadir for supplying us with the blood samples. We would like to thank Mr A. Zoubair for his invaluable help with proofreading. This work was possible thanks to grants PB95-0267-C02-01 and PB98-1064 (CICYT, Spain) to JB.

References

ANDERSON, S., BANKIER, A. T., BARRELL, B. G., DE BRUIJN, M. H., COULSON, A. R., SANGER, F., SCHREIER, P. H., SMITH, A. J. H., STADEN, R., and YOUNG, G., 1981, Sequence and organization of the human mitochondrial genome. Nature, 290, 457-465.

ARIS-BROSOU, S., and EXCOFFIER, L., 1996, The impact of population expansion and mutation rate heterogeneity on DNA sequence polymorphism. Molecular Biology and Evolution, 13, 494-504.

BANDELT, H. J., FORSTER, P., SYKES, B., and RICHARDS, M. B., 1995, Mitochondrial portraits of human populations using median networks. Genetics, 141, 743-753.

BARBUJANI, G., PILASTRO, A., DE DOMENICO, S., and RENFREW, C., 1994, Genetic variation in North Africa and Eurasia: Neolithic demic diffusion vs. Paleolithic colonisation. American Journal of Physical Anthropology, 95, 137-154.

BERTRANPETIT, J., SALA, J., CALAFELL, F., UNDERHILL, P., MORAL, P., and COMAS, D., 1995, Human mitochondrial DNA variation and the origin of the Basques. Annals of Human Genetics, 59, 63-81.

BOSCH, E., CALAFELL, F., PÉREZ-LEZAUN, A., COMAS, D., MATEU, E., and BERTRANPETIT, J., 1997, A population history of Northern Africa: evidence from classical genetic markers. Human Biology, 69, 295-311.

BOSCH, E., CALAFELL, F., SANTOS, F. R., PÉREZ-LEZAUN, A., COMAS, D., BENCHEMSI, N., TYLER-SMITH, C., and BERTRANPETIT, J., 1999, Variation in short tandem repeats is deeply structured by genetic background on the human Y chromosome. American Journal of Human Genetics, 65, 1623-1638.

BRIGNON, J., AMINE, A., BOUTALEB, B., MARTINET, G., and ROSENBERGER, B., 1967, Histoire du Maroc (Rabat: Hatier).

CABOT, E. L., 1988, ESEE: the Eyeball Sequence Editor, version 1.06 (Burnaby: University of British Columbia).

CAVALLI-SFORZA, L. L., MENOZZI, P., and PIAZZA, A., 1994, History and Geography of Human Genes (Princeton, NJ: Princeton University Press).

CHEN, Y. S., TORRONI, A., EXCOFFIER, L., SANTACHIARA-BENERECETTI, A. S., and WALLACE, D. C., 1995, Analysis of mtDNA variation in African populations reveals the most ancient of all human continent-specific haplogroups. American Journal of Human Genetics, 57, 133-149.

COMAS, D., CALAFELL, F., MATEU, E., PÉREZ-LEZAUN, A., BOSCH, E., and BERTRANPETIT, J., 1997, Mitochondrial DNA variation and the origin of Europeans. Human Genetics, 99, 443-449.

CÔRTE-REAL, H. B. S. M., MACAULAY, V., RICHARDS, M. B., HARITI, G., ISSAD, M. S., CAMBON-THOMSEN, A., PAPIHA, S. S., BERTRANPETIT, J., and SYKES, B., 1996, Genetic diversity in the Iberian Peninsula determined from mitochondrial sequence analysis. Annals of Human Genetics, 60, 331-350.

EFRON, B., 1982, The Jackknife, the Bootstrap, and other Resampling Plans (Philadelphia, PA: Society for Industrial and Applied Mathematics).

EXCOFFIER, L., PELLEGRINI, B., SÁNCHEZ-MAZAS, A., SIMON, C., and LANGANEY, A., 1987, Genetics and history of Sub-Saharan Africa. Yearbook of Physical Anthropology, 30, 151-194.

FELSENSTEIN, J., 1989, PHYLIP-Phylogeny Inference Package (Version 3.2). Cladistics, 5, 164-166.

GAID, M., 1990, Les berbers dans l'histoire. Tome 1: de la préhistoire à la Kahina (Algiers: Editions Mimouni).

GOWER, J. C., 1966, Some distance properties of latent root and vector methods in multivariate analysis. Biometrika, 27, 857-874.

GRAVEN, L., PASSARINO, G., SEMINO, O., BOURSOT, P., SANTACHIARA-BENERECETTI, S., LANGANEY, A., and EXCOFFIER, L., 1995, Evolutionary correlation between control region sequences and restriction polymorphism in the mitochondrial genome of a large Senegalese Mandenka sample. Molecular Biology and Evolution, 12, 334-345.

GRIMBERG, J. S., NAWOSCHIK, L., BELLUSCIO, R., McKEE, A., TUCK, A., and EISEMBERG, A., 1989, A simple and efficient non-organic procedure for the isolation of genomic DNA from blood. Nucleic Acids Research, 17, 8390.

HARPENDING, H. C., SHERRY, S. T., ROGERS, A. R., and STONEKING, M., 1993, The genetic structure of ancient human populations. Current Anthropology, 34, 483-496.

IZAABEL, H., GARCHON, H. J., CAILLAT-ZUCMAN, S., BEAURAIN, G., AKHAYAT, O., BACH, J. F., and SÁNCHEZ-MAZAS, A., 1998, HLA class II DNA polymorphism in a Moroccan population from Souss, Agadir area. Tissue Antigens, 51, 106-110.

JULIEN, C.-A., 1961, Histoire de l'Afrique du Nord (Paris: Payot).

KRINGS, M., SALEM, A.-H., BAUER, K., GEISERT, H., MALEK, A. K., CHAIX, L., SIMON, C., WELSBY, D., DI RIENZO, A., UTERMANN, G., SAJANTILA, A., PÄÄBO, S., and STONEKING, M., 1999, mtDNA analysis of Nile River Valley populations: a genetic corridor or a barrier to migration? American Journal of Human Genetics, 64, 1166-1176.

MACAULAY, V., RICHARDS, M., HICKEY, E., VEGA, E., CRUCIANI, F., GUIDA, V., SCOZZARI, R., BONNÉ-TAMIR, B., SYKES, B., and TORRONI, A., 1999, The emerging tree of West Eurasian mtDNAs: a synthesis of control-region sequences and RFLPs. American Journal of Human Genetics, 64, 232-249.

MATEU, E., COMAS, D., CALAFELL, F., PÉREZ-LEZAUN, A., and BERTRANPETIT, J., 1997, A tale of two islands: population history and mitochondrial DNA sequences of Bioko and São Tomé, Gulf of Guinea. Annals of Human Genetics, 61, 507-518.

MURDOCK, G. P., 1959, Africa, its Peoples and their Culture History (London, New York: McGraw-Hill).

NEI, M., 1987, Molecular Evolutionary Genetics (New York, NY: Columbia University Press).

NEWMAN, J., 1995, The Peopling of Africa: a Geographic Interpretation (New Haven, CT: Yale University Press).

PIERCY, R., SULLIVAN, K. M., BENSON, N., and GILL, P., 1993, The application of mitochondrial DNA typing to the study of white Caucasian genetic identification. International Journal of Legal Medicine, 106, 85-90.

PINTO, F., GONZÁLEZ, A. M., HERNÁNDEZ, M., LARRUGA, J. M., and CABRERA, V. M., 1996, Genetic relationship between the Canary Islanders and their African and Spanish ancestors inferred from mitochondrial DNA sequences. Annals of Human Genetics, 60, 321-330.

PULT, I., SAJANTILA, A., SIMANAINEN, J., GEORGIEV, O., SCHAFFNER, W., and PÄÄBO, S., 1994, Mitochondrial DNA sequences from Switzerland reveal striking homogeneity of European populations. Biological Chemistry Hoppe-Seyler, 375, 837-840.

RANDO, J. C., PINTO, F., GONZÁLEZ, A. M., HERNÁNDEZ, M., LARRUGA, J. M., CABRERA, V. M., and BANDELT, H.-J., 1998, Mitochondrial DNA analysis in Northwestern African populations reveals genetic exchanges with European, near-Eastern, and sub-Saharan populations. Annals of Human Genetics, 62, 531-550.

RENFREW, C., 1991, Before Babel: speculations on the origins of linguistic diversity. Cambridge Archaeological Journal, 1, 3-23.

ROGERS, A. R., and HARPENDING, H., 1992, Population growth makes waves in the distribution of pairwise genetic differences. Molecular Biology and Evolution, 9, 552-569.

RUHLEN, M., 1987, A Guide to the World's Languages (Stanford: Stanford University Press).

SAITOU, N., and NEI, M., 1987, The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4, 406-425.

SCHNEIDER, S., KUEFFER, J.-M., and EXCOFFIER, L., 1997, Arlequin ver 1.1: a Software for Population Genetic Data Analysis (Geneva: Genetics and Biometry Laboratory, University of Geneva).

SIMONI, L., CALAFELL, F., PETTENER, D., BERTRANPETIT, J., and BARBUJANI, G., 2000, Geographic patterns of DNA diversity in Europe. American Journal of Human Genetics, 66, 262-278.

TAMURA, A., and NEI, M., 1993, Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Molecular Biology and Evolution, 10, 512-523.

TORRONI, A., HUOPONEN, K., FRANCALACCI, P., PETROZZI, M., MORELLI, L., SCOZZARI, R., OBINU, D., SAVONTAUS, M.-L., and WALLACE, D. C., 1996, Classification of European mtDNAs from an analysis of three European populations. Genetics, 144, 1835-1850.

TORRONI, A., SCHURR, T. G., YANG, C. C., SZATHMARY, E. J. E., WILLIAMS, R. C., SCHANFIELD, M. S., TROUP, G. A., KNOWLER, W. C., LAWRENCE, D. N., WEISS, K. M., and WALLACE, D. C., 1992, Native American mitochondrial DNA analysis indicates that the Amerind and the Nadene populations were founded by two independent migrations. Genetics, 130, 153-162.

VIGILANT, L., STONEKING, M., HARPENDING, H., HAWKES, K., and WILSON, A. C., 1991, African populations and the evolution of mitochondrial DNA. Science, 253, 1503-1507.

WATSON, E., BAUER, K., AMAN, R., WEISS, G., VON HAESELER, A., and PÄÄBO, S., 1996, mtDNA sequence diversity in Africa. American Journal of Human Genetics, 59, 437-444.

WATSON, E., FORSTER, P., RICHARDS, M., and BANDELT, H. J., 1997, Mitochondrial footprints of human expansions in Africa. American Journal of Human Genetics, 61, 691-704.

Address for correspondence: Francesc Calafell, Unitat de Biologia Evolutiva, Facultat de Ciencies de la Salut i de la Vida, Universitat Pompeu Fabra, Doctor Aiguader 80, 08003 Barcelona (Catalonia), Spain.

email: francesc.calafell@cexs.upf.es

Zusammenfassung (German summary). Background: Various populations have contributed to the present-day gene pool of Morocco, including the autochthonous Berbers, Phoenicians, Sephardic Jews, Bedouins and sub-Saharan Africans. Objective: The primary aim of the study was to produce a genetic description of the Berber-speaking population of the Souss region of southern Morocco by means of mitochondrial DNA (mtDNA) sequence analysis. Material and methods: The first hypervariable segment of the mtDNA control region was sequenced in a sample of 50 individuals from the Souss Valley. The results were compared with the extensive data available on mtDNA sequence diversity in Europe and sub-Saharan Africa. Results: Thirty-four different sequences were found; an estimated 68% of the sequences occur in Europe, West Asia and North Africa, 26% originate from sub-Saharan Africa, and 6% belong to the North African specific haplogroup U6. The mtDNA sequences in the Souss Valley point to the presence of two populations that settled there at different times: the West Eurasian sequences in the Souss sample have a smaller average number of pairwise differences than pairs of sub-Saharan sequences. Conclusion: Detailed knowledge of the possible geographic origin of each sequence facilitates the interpretation of both internal diversity and the relationships between populations. The sub-Saharan admixture in the Souss Valley fits the south-north cline of sub-Saharan influence in North Africa, which is evidently also present in the genetic distances of North African populations to Europeans and sub-Saharan Africans.

Résumé (French summary). Background: Various populations have contributed to the present-day gene pool of Morocco, from the autochthonous Berber populations to Phoenicians, Sephardic Jews, Bedouin Arabs and sub-Saharan Africans. Objective: To produce a genetic description of the Berber-speaking population of the Souss Valley based on mitochondrial DNA sequence analysis. Subjects and methods: The first hypervariable segment of the mtDNA control region was sequenced in a sample of 50 individuals from the Souss Valley, and the results were compared with the extensive data on mtDNA sequence variation in Europe and sub-Saharan Africa. Results: Thirty-four different sequences were found, of which about 68% occur in Europe, West Asia and North Africa, 26% originate in sub-Saharan Africa and 6% belong to haplogroup U6, which is specific to North Africa. The mtDNA sequences of the Souss Valley indicate the presence of two populations that expanded at different times: the West Eurasian sequences in the Souss sample have a smaller mean number of pairwise differences than pairs of sub-Saharan sequences. Conclusion: Detailed knowledge of the possible geographic origin of each sequence facilitated the interpretation of both internal diversity parameters and relationships between populations. The sub-Saharan component in the Souss Valley agrees with the south-north gradient of sub-Saharan influence in North Africa, also evident in the genetic distances of North African populations to European and sub-Saharan African populations.