Master thesis in Ecology, 60 ECTS Vt 2019 Genetic diversity and hardiness in Scots pine from Scandinavia to Russia Jenny Olsson

Genetic diversity and hardiness in Scots pine from ...umu.diva-portal.org/smash/get/diva2:1325184/FULLTEXT01.pdf · Asia (~138°E), and latitudinally from Spain (~37°N) to northern

Abstract

The postglacial recolonization of northern Europe is thought to have originated from Western Europe and the Russian Plain; however, recent molecular and macrofossil-based investigations suggest that the history may be more complex than previously thought. This study aims to investigate the genetic diversity and population structure of Scots pine from Scandinavia to Russia to re-evaluate its recolonization history, and to examine whether the pattern of spatial genetic diversity has any adaptive significance. Populations ranging from Norway to Russia were sampled and genotyped using genotyping-by-sequencing. The seedlings were freeze tested to provide an average degree of hardiness for every population. Eight hundred and thirty-two seedlings were analyzed, and 6,034 SNPs were recovered in these individuals after stringent filtering. Population structure was investigated using fastStructure, and differentiation between populations was estimated with pairwise FST and analysis of molecular variance (AMOVA) to assess the genetic variability. Genetic diversity was measured as observed heterozygosity (Ho) in populations, clusters and overall. Two genetic clusters were detected in the samples, one in Norway and Sweden and one in Russia. These clusters are weakly differentiated (FST = 0.01202), with only 0.66 % of the variation found between them. The highest variation was found within populations (98.8 %), and the overall genetic diversity for all populations was high (Ho = 0.2573). The weak differentiation and high diversity are indicative of extensive gene flow between populations in this species. The composition of the clusters across the sampled area suggests a westward recolonization from the Russian Plain into Scandinavia, and a possible local origin of another polymorphism in Norway and Sweden. No clear relationship between cold hardiness and genetic variation was detected. The clinal variation in cold hardiness reflects local adaptation, and the difference between genetic and phenotypic variation is likely due to epigenetic regulation or polygenic inheritance. A more extensive genome scan is needed to understand the genetic basis of local adaptation.

Key words: Pinus sylvestris, recolonization history, genetic diversity, cold hardiness, clinal variation

Page 4: Genetic diversity and hardiness in Scots pine from ...umu.diva-portal.org/smash/get/diva2:1325184/FULLTEXT01.pdf · Asia (~138°E), and latitudinally from Spain (~37°N) to northern

Table of Contents

1 Introduction
1.1 Present distribution and ecology of Scots pine
1.2 Last glacial maximum and postglacial recolonization
1.3 Genetic diversity
1.4 Phenotypic diversity
1.5 Genotyping-by-sequencing
1.6 Aim
2 Materials and methods
2.1 Sampling and freeze testing
2.2 DNA isolation
2.3 GBS library preparations
2.4 Bioinformatics
2.5 SNP filtering
2.6 Structure analysis and population differentiation
2.7 Genetic diversity
3 Results
3.1 Sequencing data quality and GBS results
3.2 Structure analysis
3.3 Population differentiation
3.4 Genetic diversity in Scots pine
4 Discussion
4.1 Genotyping-by-sequencing
4.2 Low population structure and high genetic diversity in Scots pine
4.3 Glacial refugium and postglacial migration
4.4 Phenotypic variation is not coupled with genetic variation
4.5 Conclusions
Acknowledgements
References
Appendix
Appendix 1


1 Introduction

1.1 Present distribution and ecology of Scots pine

Scots pine (Pinus sylvestris L.) is one of the most widely distributed conifers in the northern hemisphere. The species ranges longitudinally from Western Europe (~7°W) to central-eastern Asia (~138°E), and latitudinally from Spain (~37°N) to northern Norway (~70°N) (Carlisle and Brown 1968; Eckenwalder 2009). Its altitudinal range extends from sea level to around 2,400 m above sea level, with the highest elevations occurring in the southern parts of the distribution. This widespread occurrence (longitude, latitude and altitude) demonstrates an ability to tolerate different climatic conditions (Carlisle and Brown 1968). Scots pine has a broad habitat amplitude but occurs mostly on dry or mesic soils with a podzol profile. These soils are nutrient-deficient and have a low pH, with ground vegetation consisting mostly of bilberry, lichens and heather. It is a pioneer species with high light requirements for establishment and therefore thrives in areas frequently subjected to fire disturbances (Esseen et al. 1997; Fries et al. 1997; Øyen et al. 2006; Eckenwalder 2009). Scots pine usually forms extensive pure forests or mixed stands with birch and other conifers, but in Siberia, south of the boreal forest, the species is confined to small, isolated groves (Eckenwalder 2009). Boreal forests in Scandinavia have a high economic and ecological value. They have been utilized for different purposes for a very long time, but the most intense utilization has occurred during the last century alongside the development of forest industries (Östlund 2004). The forest industry accounts for 10 % of Sweden’s total export, which makes Sweden one of the leading export countries of wood, pulp and paper products (Skogsindustrierna 2017). Scots pine stands make up 39 % of the productive forests in Sweden (The Swedish National Forest Inventory 2018).
With this high economic value and the intensively managed production forest, the ecological implications of forestry practices for genetic diversity have become a major concern, since intense forestry often involves cultivated plants or trees, which could lead to a loss of genetic diversity. Due to intensive management for production, many plant and animal populations have decreased in abundance, which has had a huge impact on the overall biodiversity in Scandinavian boreal forests (Östlund 2004).

1.2 Last glacial maximum and postglacial recolonization

Knowledge about the history of Scots pine, and its recolonization routes after the last glacial maximum (LGM), is important for understanding its current distribution. The LGM occurred around 20,000 years before present (B.P.), and the ice sheet covered most of the area where Scots pine is currently distributed in northern Europe (Svendsen et al. 1999). The ice started to retreat due to increased northern summer insolation between 20,000 and 19,000 years ago (Clark et al. 2009). The effects of ice ages on European plant species have differed depending on whether the species are cold tolerant or not. During the Quaternary period, many species went through range contractions when the temperature decreased, resulting in extinctions of northern populations, and range expansions with rapid colonization when the temperature started to rise again. These rapid recolonizations resulted in successive bottlenecks that may have led to loss of genetic diversity among populations in northern Europe (Hewitt 1996). However, each species has responded independently to colder periods, which gives each its own unique history of recolonization patterns (Taberlet et al. 1998). During the LGM, pine species in Europe were present only in fragmented populations and mostly in ice-free areas called refugia (Bennett, Tzedakis and Willis 1991; Matías and Jump 2012).
Most refugia were located in the southern parts of Europe, in Italy and the Iberian Peninsula (Cheddadi et al. 2006; Labra et al. 2006). However, some studies also consider refugial areas in the northern parts of Europe, since several tree species arrived in the north remarkably soon after the deglaciation period. Kullman (2008) reviewed evidence of megafossil remains in Sweden and Norway to determine whether there might have been refugia closer to the ice sheet. According to these megafossils, the oldest Scots pine appeared in the southern Scandes around 11,700 yr. B.P., and he concluded that there might have existed “cryptic” refugia closer to the edge of the ice at its full extension. Parducci et al. (2012) further
investigated the possibility of refugial areas in Scandinavia by combining traditional paleoecological methods with modern and ancient DNA analyses. Their most remarkable result was the presence of pine and spruce DNA in lake sediments close to Andøya, northern Norway, dating to around 22,000 and 17,700 cal. yr. B.P., respectively. These results, in combination with previous macrofossil analyses, suggest that coniferous trees were present in Scandinavia during the last glaciation. In addition to glacial refugia in Scandinavia, some studies also consider refugial areas east of Moscow (Pyhäjärvi, Salmela and Savolainen 2008; Buchovska et al. 2013), which conflicts with the traditional hypothesis that boreal trees were only present in southern Europe during the last glaciation. When the ice started to retreat and the climate became warmer, both animals and plants started to recolonize northern Europe. Tóth et al. (2017) reviewed studies of different molecular markers to determine the evolutionary history and phylogeographic pattern of Scots pine in Europe. The authors conclude that several studies support a recolonization history originating from Mediterranean areas. However, some studies also suggest an origin from the eastern Alps and the surroundings of the Danube plain in Bulgaria. These migrations occurred between 14,000 and 8,000 B.P. Taberlet et al. (1998) conducted a comparative phylogeographic study of 10 species and found that Scandinavia was probably colonized from two main directions, from Western Europe and from the Russian Plain. Scots pine was not included in their study, but Sinclair, Morman and Ennos (1999) investigated the postglacial history of Scots pine in Western Europe by studying mitochondrial DNA (mtDNA). Their results show that two different mitochondrial haplotypes (mitotypes) were present in Scandinavia.
Mitotype a was fixed in southern Sweden whereas mitotype b was fixed in Norway, Finland and northern Sweden. The authors conclude that this is consistent with dual migration of genetically differentiated populations, from the south and from the northeast. This hypothesis is strongly supported by the results of Dering et al. (2017), who found similar mitotype distributions and recolonization routes for Scots pine in Scandinavia. These refugia and the recolonization histories during the Holocene have played a major role in determining the current natural distribution limits of Scots pine. They have also had a strong influence on the genetic diversity and adaptability of this species in newly colonized areas (Matías and Jump 2012). Populations with important roles in the recolonization of northern Europe might have cold-tolerant genotypes, whereas populations in southern parts of Europe maintained genotypes suited to less moisture and warmer climates (Tóth et al. 2017). However, few studies have investigated both genetic and phenotypic variation in the same set of populations while trying to answer these questions, and therefore little is known about the distribution of genetic and phenotypic variation over space.

1.3 Genetic diversity

Organisms’ abilities to adapt to changing environments are supported by genetic diversity. Studying genetic diversity has therefore become important for understanding how species have been able to adapt to environmental fluctuations in the past (Logan et al. 2019; Rizvanovic et al. 2019; Varsamis et al. 2019). These studies also provide information about how different species may react to the ongoing climate change. Genetic diversity can be determined based on morphological, biochemical and molecular information. Since the genetic basis of morphological traits in trees is mostly unknown, and because these traits often depend on environmental context and on the expression of many unlinked genes, they have strong limitations in studies of genetic diversity (Schulman 2007). Due to these limitations, morphological markers have been largely replaced by biochemical markers, e.g. allozymes (Wu, Krutovskii and Strauss 1999). However, biochemical markers also have limitations, e.g. environmental effects on expression patterns, and only a small portion of genetic variation is reflected at the allozyme level. DNA molecular markers, on the other hand, provide direct information about genetic differences on a more detailed level and without
interferences of environmental factors compared to morphological and biochemical markers (Schulman 2007; Abdel-Mawgood 2012; Holliday, Hallerman and Haak 2018). Molecular markers that have been utilized to study genetic diversity are for example RFLP (restriction fragment length polymorphism) (Botstein et al. 1980), SSR (simple sequence repeat, also called microsatellites) (Tautz 1989), AFLP (amplified fragment length polymorphism) (Vos et al. 1995) and SNPs (single nucleotide polymorphisms) (Chen and Sullivan 2003). SSR markers are one of the most popular markers in animal and plant genetic investigations. They consist of short nucleotide sequences (1 – 6 bp in length) which are repeated and spread across the eukaryotic genome in a random fashion. They are also highly polymorphic due to high mutation rates that affect the number of repeats (Abdel-Mawgood 2012). SSR markers are widely used when studying genetic diversity, often in agreement with chloroplast DNA (cpDNA) markers (Morand et al. 2002; Bogdan et al. 2018). However, SSRs are quickly being replaced by SNPs as the preferred DNA marker due to the advance of high throughput sequencing technology. SNP markers are more abundant, stable and increasingly cost-effective for genomic studies. SNPs represent sites where the DNA sequence differs by only one base when individuals are compared to each other. These differences might represent neutral variation that is useful when assessing genetic diversity in connection with evolutionary history (Abdel-Mawgood 2012). Investigations of genetic diversity in Scots pine using these different markers reveal somewhat similar results. Whole-genome studies using nuclear markers show high variation throughout the distribution range (Pyhäjärvi et al. 2007; García Gil et al. 2015). A high genetic diversity is also found in studies of allozyme differentiation among populations in Sweden and China (Gullberg et al. 1985; Szmidt and Muona 1985; Wang, Szmidt and Lindgren 1991). 
The variation is high within populations whereas the differentiation between populations is low. This is also seen in the paternally inherited chloroplast genome (Dering et al. 2017). However, studies of the mitochondrial genome, which is maternally inherited, reveal results with slightly lower genetic diversity and higher population differentiation in contrast to the results found in whole-genome studies (Pyhäjärvi, Salmela and Savolainen 2008; García Gil et al. 2015; Dering et al. 2017). MtDNA is considered more informative when investigating recent colonization history since it can persist for a longer time due to its maternal inheritance in pine, which implies transmission via seeds. Nuclear markers, however, are useful for determining the current state of differentiation by enabling comparisons between complex and potentially more precise demographic models (Xia et al. 2018). Whole-genome studies reveal that European populations are undifferentiated from each other, apart from Spanish populations which seem to be somewhat differentiated from the rest of Europe (Dvornyk et al. 2002; Pyhäjärvi et al. 2007). Even though there is a slight difference in genetic diversity from studies of mitochondrial genomes compared to other markers, the overall variation is high. However, little is still known regarding the overall genetic diversity and population differentiation in Scandinavia due to the limited sampling and investigation of Norwegian Scots pine populations using a whole-genome approach.
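
To make the link between SNP genotypes and a diversity measure concrete, the following minimal sketch computes observed heterozygosity (Ho), the statistic used in this study, as the mean proportion of heterozygous calls per locus. The genotype coding (0, 1 or 2 alternate alleles, None for a missing call), the function name and the toy data are assumptions made for illustration only; they do not reproduce the software or data used in the thesis.

```python
from typing import Optional, Sequence


def observed_heterozygosity(genotypes: Sequence[Sequence[Optional[int]]]) -> float:
    """Mean proportion of heterozygous calls across SNP loci.

    genotypes[i][j] is the call for individual i at locus j, coded as the
    number of alternate alleles (0, 1 or 2); None marks a missing call.
    This coding is an assumption of the sketch.
    """
    n_loci = len(genotypes[0])
    per_locus = []
    for j in range(n_loci):
        calls = [ind[j] for ind in genotypes if ind[j] is not None]
        if calls:  # skip loci where every call is missing
            per_locus.append(sum(1 for g in calls if g == 1) / len(calls))
    return sum(per_locus) / len(per_locus)


# Toy data: 4 individuals x 3 SNPs (invented for illustration).
toy = [
    [0, 1, 2],
    [1, 1, None],
    [0, 0, 2],
    [1, 2, 1],
]
ho = observed_heterozygosity(toy)
print(f"Ho = {ho:.4f}")
```

Averaging per locus rather than pooling all calls keeps loci with many missing genotypes from being down-weighted; other conventions exist and would give slightly different values.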

1.4 Phenotypic diversity

Phenotypes are the result of genotype expressions under different environmental conditions (Nicotra et al. 2010). Phenotypic diversity plays an important role in the survival of a species in different environments. Scots pine has evolved a clinal adaptation to variations in climatic factors, e.g. temperature and photoperiod (Andersson and Fedorkov 2004). Northern and southern populations often differ in frost hardiness, with northern populations being hardier. Hurme et al. (1997) studied frost hardiness, bud set and the relationship between them in 1-year-old seedlings in Finland and showed that northern populations develop frost hardiness earlier than southern populations. Northern populations also set their buds significantly earlier than the southern populations. These results show latitudinal clines in both bud set and frost hardiness for Scots pine seedlings, and that the timing of bud set predicts the initiation of frost hardening.

In another study, Andersson Gull et al. (2018) compared shoot phenology of Scots pine populations in Scandinavia and Russia. Their results showed that northern populations began and ceased shoot elongation at a lower temperature sum than southern populations, indicating higher frost hardiness. Adaptation to colder climates can also drive variability in needle characteristics, as seen in a study by Jankowski et al. (2017), where populations in northern Sweden have thicker, wider and shorter needles compared to populations in southern Sweden. However, the authors observed that populations in Poland resemble populations in northern Sweden in needle thickness and width, which they suggest is explained by e.g. moisture availability. In addition to these latitudinal patterns, a longitudinal difference in frost hardiness is also observed, where populations in continental climates tend to be hardier than populations in maritime climates at corresponding latitudes (Andersson and Fedorkov 2004; Andersson Gull et al. 2018). Since Scots pine is one of the most valuable conifers in Scandinavia, both economically and ecologically (Matías and Jump 2012), it is important to have knowledge about its phenotypic variation and genetic diversity. Different needle, stem or bud characteristics and phenology influence the survival of an individual in a particular environment. The latitudinal variation in cold hardiness seen across Finland and Scandinavia (Hurme et al. 1997; Jankowski et al. 2017) could be related to the dual recolonization history, and survival in two distinct refugia during the LGM might have led to different adaptation patterns with diversifying selection under these circumstances (Zimmer and Sønstebø 2018). These hypotheses, however, have not been properly tested to date. To better clarify the association between phenotypic variation and genetic variation, coordinated investigations on both phenotype and genotype are needed.
In addition, due to the polygenic nature of most phenotypic traits, genome-wide analysis is necessary for establishing potential associations.

1.5 Genotyping-by-sequencing

Pinus species have estimated genome sizes of 22 – 32 giga-base pairs (Gbp), which is on average larger than the genome size of their closest relatives, Picea (19.6 Gbp for P. abies) (Nystedt et al. 2013; Wegrzyn et al. 2014). Not only is the genome large, but over 80 % of it consists of highly methylated repetitive sequences, which cause ambiguous assembly of paralogous loci (Pan et al. 2015). The size of these genomes has been a problem in whole-genome sequencing even with the recent development of next generation sequencing (NGS) technologies (Wegrzyn et al. 2014; Prunier, Verta and MacKay 2016). For large genomes such as those of conifers, reduced-genome sequencing has become a practical strategy for genome-wide surveys in population samples. Restriction site-associated DNA sequencing (RADseq) is a simple and cost-effective NGS method that targets a subset of the genome, i.e. it is a reduced-representation method, which is more feasible than sequencing the whole genome in a species with a large genome size (Andrews et al. 2016; Lowry et al. 2017). If a high-quality reference genome is available, the reads can be mapped to it and SNPs can be called as for whole-genome sequencing (Davey et al. 2011). Restriction enzymes (REs) have been utilized for a long time to sample loci across the genome because they reduce genome complexity. It is a simple, highly reproducible and extremely specific approach (Elshire et al. 2011; Andrews et al. 2016). Because pine genomes are highly methylated and contain large proportions of repetitive sequences, sequencing them can produce a skewed depth of coverage owing to the high copy number of the repeats. Much of this repetitive fraction can be avoided by using methylation-sensitive REs (Davey et al. 2011; Elshire et al. 2011). Genotyping-by-sequencing (GBS) is a less complex version of RADseq with a simpler library preparation protocol (Davey et al. 2011; Elshire et al. 2011).
GBS is quite straightforward for small genomes but requires evaluation and selection of REs for large genomes in order to reduce their genome complexity (Elshire et al. 2011).

1.6 Aim

Previous studies show a marked phenotypic variation in cold hardiness between southern and northern populations of Scots pine in Scandinavia and western Russia. The genetic diversity across these populations, however, is unknown. The aim of this study is to investigate the genetic diversity and population structure of Scots pine from Scandinavia to Russia in order to re-evaluate the recolonization history, and to explain the phenotypic variation in cold hardiness across this region. By comparing genotypic data with phenotypic data of the same set of populations, it might be possible to find a genetic component that controls the ability of Scots pine to adapt to colder climates. The hypothesis is that the population structure differs between northern and southern populations. This structure is influenced by colonization history, but may also be associated with environmental gradients over the distributions from Scandinavia to Russia. This study addresses the following questions:

I. Is there a population genetic structure in this region? If so, is this structure associated with phenotypic variation?

II. Is there evidence of a “cryptic” refugium in the far north? If so, there might be in situ populations that differ from populations that migrated from Western Europe or the Russian Plain.

III. Does the genotypic variation explain the phenotypic variation seen in Scandinavia?

2 Materials and methods

2.1 Sampling and freeze testing

Seed samples collected from 23 populations (plus 3 previously genotyped populations) ranging from Norway to western Russia (Fig. 1) were included in this study. Seedlings from each population were grown in a greenhouse and needles were collected for DNA isolation before the seedlings were subjected to a freeze test following the procedure described in Andersson and Fedorkov (2004). The needle damage due to exposure to low temperatures was scored for every seedling to provide an average degree of hardiness for each population. A low score indicates low needle damage and therefore higher tolerance to colder environments (Table 1). This test was established by The Forestry Research Institute of Sweden (Skogforsk) at their research station in Sävar, Sweden. Needles were collected from the greenhouse in February 2018 and stored in a freezer until their DNA was isolated and genotyped.
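
The step from per-seedling damage scores to a population-level hardiness class can be sketched as follows. The class boundaries are those listed in the Hardiness column of Table 1; the numeric scale of the individual scores, the function name and the example scores are assumptions made for illustration, since the scoring protocol itself is described in Andersson and Fedorkov (2004) rather than here.

```python
from statistics import mean

# Class boundaries as listed in the Hardiness column of Table 1.
# Treated here as lower bound inclusive, upper bound exclusive.
HARDINESS_CLASSES = [
    (0.0, 1.0, "0 - 0.99"),
    (1.0, 2.0, "1.00 - 1.99"),
    (2.0, 2.5, "2.00 - 2.49"),
    (2.5, 3.0, "2.50 - 2.99"),
    (3.0, 4.0, "3.00 - 3.99"),
    (4.0, 5.0, "4.00 - 4.99"),
]


def hardiness_class(scores):
    """Return (mean damage score, hardiness class label) for one population."""
    m = mean(scores)
    for lower, upper, label in HARDINESS_CLASSES:
        if lower <= m < upper:
            return m, label
    raise ValueError(f"mean score {m} falls outside the classes in Table 1")


# Hypothetical per-seedling needle damage scores (lower = less damage).
hardy_mean, hardy_label = hardiness_class([0, 1, 1, 2, 0])
sensitive_mean, sensitive_label = hardiness_class([3, 4, 5, 4])
print(hardy_label, sensitive_label)
```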

Figure 1. Sampling locations. Populations marked with d (in Sweden) had their DNA isolated and were genotyped before this study, and were added to the analysis.

Table 1. Population information. Hardiness determined by damage on needles after freeze test. N: the number of seedlings tested for hardiness. G: the number of individuals genotyped. Population d5 was not freeze tested due to its southern origin but is treated as a sensitive (4.00 - 4.99) population in this study.

Pop. ID  Location                Country  Lat.  Long.  Hardiness        N     G
26       Rovaniemen mlk          Finland  66.9  25.4   1.00 - 1.99    126    20
29       Pudasjärvi              Finland  65.5  27.6   1.00 - 1.99    102    19
41       Kuhmo                   Finland  64.1  29.6   2.00 - 2.49    122    65
39       Äänekoski               Finland  62.8  25.7   3.00 - 3.99     99    20
48       Riistina                Finland  61.5  27.4   4.00 - 4.99     25    17
13       Uusikaupunki            Finland  60.9  21.3   4.00 - 4.99     60    50
5        Alta, Stengelsen        Norway   69.9  23.3   0 - 0.99       105    60
31       Kirkesmoen, Troms       Norway   69.0  19.2   0 - 0.99       104    65
37       Beiarn, Nordland        Norway   66.9  14.3   2.50 - 2.99    103    60
42       Hemne, Sør-Trøndelag    Norway   63.3  9.0    3.00 - 3.99    103    65
2        Molde, Gjemnes, Skodje  Norway   62.8  7.5    4.00 - 4.99    104    60
25       Vågå, Oppland           Norway   61.5  9.0    3.00 - 3.99    102    20
20       Ringerike, Buskerud     Norway   60.4  10.0   4.00 - 4.99    104    60
54       Archangelsk stand 13    Russia   64.5  40.7   2.50 - 2.99     87    57
35       Archangelsk stand 5     Russia   62.1  40.6   3.00 - 3.99    100    18
17       Archangelsk stand 3     Russia   61.2  46.2   2.50 - 2.99    103    20
4        Archangelsk plantation  Russia   61.0  42.3   3.00 - 3.99    103    24
28       Udora, Komi             Russia   64.3  49.2   1.00 - 1.99     79    19
53       Sosnogorsk, Komi        Russia   63.8  54.8   1.00 - 1.99      8     8
11       Megdurechensk, Komi     Russia   63.1  50.8   2.00 - 2.49     83    19
45       Pomozdino, Komi         Russia   62.1  54.3   2.50 - 2.99     78    62
21       Hammarstrand            Sweden   63.1  16.2   3.00 - 3.99    123    20
10       Gunnilbo                Sweden   59.9  15.8   4.00 - 4.99     90    60
d5       Skillingaryd            Sweden   57.4  14.0   NA              NA    50
d21      Lillberget              Sweden   64.3  19.5   0 - 0.99        76    50
d60      Kuttainen               Sweden   68.2  22.8   0 - 0.99        85    50
Total                                                                2274  1038

2.2 DNA isolation

DNA from 936 individuals was isolated using the E.Z.N.A. SP Plant DNA Kit (Omega Bio-Tek) protocol for fresh/frozen samples with slight modifications. The isolated DNA samples were quantified using a Synergy™ HTX Multi-Mode Reader (BioTek™) with the Take3 protocol to calculate the volume containing 800 ng DNA. In this experimental system, 300 DNA samples are pooled into each library, including a few samples as replicates. Samples with low DNA concentration were removed, giving a total of 888 individuals for GBS preparations. From each sample, 800 ng of DNA was transferred into a new 96-well plate, dried for 2 - 2.5 h using a Savant™ DNA 120 SpeedVac™ and then resuspended in 40 µL ddH2O overnight. The DNA samples were quantified again using the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific™), and 200 ng DNA from each sample was drawn and dried for further use in the GBS library preparations.
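
The volume calculation mentioned above is simple arithmetic: the volume containing a target mass equals the mass divided by the measured concentration. The sketch below illustrates it together with a filter for low-concentration samples. The sample names, concentrations and the maximum usable volume are hypothetical; the thesis does not state the cut-off that was applied.

```python
def volume_for_mass(conc_ng_per_ul: float, target_ng: float = 800.0) -> float:
    """Volume (µL) that contains target_ng of DNA at the given concentration."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ng / conc_ng_per_ul


# Hypothetical concentrations in ng/µL, keyed by an invented sample name.
concentrations = {"s1": 40.0, "s2": 16.0, "s3": 2.0}

# Hypothetical cut-off: samples needing more than this volume are dropped.
MAX_VOLUME_UL = 100.0

usable = {}
for sample, conc in concentrations.items():
    volume = volume_for_mass(conc)
    if volume <= MAX_VOLUME_UL:
        usable[sample] = volume

print(usable)
```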

2.3 GBS library preparations

The GBS libraries were prepared following the method presented in Pan et al. (2015). From each sample, 200 ng DNA was digested using the restriction enzyme PstI-HF® (CTGCAG) (New England Biolabs® Inc.). Forward and reverse adapters were ligated to each of the resulting fragments using T4 DNA ligase (New England Biolabs® Inc.). Individual barcodes were attached to the forward adapter to identify reads from each individual after sequencing. The reaction was carried out at 37 ℃ for 8 h followed by 65 ℃ for 30 min. After digestion and ligation, 300 samples were pooled into each library. From the pooled sample, 900 µL was drawn and purified using the QIAquick PCR Purification Kit (Qiagen), and the concentration was measured using the Qubit dsDNA HS Assay Kit. This sample was amplified by PCR (polymerase chain reaction) using Q5® High-Fidelity DNA polymerase (New
England Biolabs® Inc.) with a program of 72 ℃ for 2 min, 98 ℃ for 30 sec, followed by 15 cycles of *98 ℃ for 10 sec, 65 ℃ for 30 sec and 72 ℃ for 20 sec*, ending with 72 ℃ for 2 min. The PCR product was purified again using QIAquick PCR Purification Kit and the concentration was measured using Qubit dsDNA HS Assay Kit. The purified sample was loaded on to Invitrogen E-gel® EX 2 % Agarose gel (Thermo Fisher ScientificTM) on an E-gel® iBaseTM Power system, and fragments between 350 to 450 base pairs (bp) were excised for purification using QIAquick Gel Extraction Kit (Qiagen). The final DNA concentration was measured using Quant-iTTM PicoGreenTM dsDNA Assay Kit (Thermo Fisher ScientificTM), and the libraries were submitted to Novogene(HK) Company Limited in Hong Kong, China, for sequencing on Illumina HiSeq X. In total, three GBS libraries were prepared in this study.

2.4 Bioinformatics

Stacks (Catchen et al. 2013) is a pipeline for building loci from short-read sequences, designed to work with any restriction-enzyme-based GBS data. It was developed to help minimize the challenges of using GBS methods for genetic studies. The Stacks program process_radtags was used to clean the raw sequence reads and separate them according to their individual barcodes (demultiplexing) (Catchen et al. 2013). The demultiplexing results showed that each of the 888 samples had more than 100,000 reads and could be used in the study. Three additional populations, each with 50 individuals, were added to increase the number of populations from Sweden. These samples had already been genotyped before this study. In total, 1038 individuals from 26 populations were analyzed. All barcode sequences were removed using Trimmomatic (Bolger, Lohse and Usadel 2014) and the sequences were mapped to the reference genome of Pinus taeda (Zimin et al. 2014) using the Burrows-Wheeler Aligner (Li and Durbin 2009). SNPs were called using SAMtools/BCFtools (Li et al. 2009).

2.5 SNP filtering

Filtering began by removing SNPs within 5 bp of an indel using BCFtools. The remaining filtering was conducted with VCFtools (Danecek et al. 2011). First, all indels were removed so that only SNPs remained. These first two filtering steps prevent falsely called SNPs due to indel misalignment. Second, all SNPs with a mapping quality of less than 40 % to the reference genome were removed, after which genotypes with genotype quality (GQ) <20 and read depth (DP) <5 were masked as missing. Lastly, loci with a missing rate of >10 %, minor allele frequency (MAF) <5 %, heterozygosity >70 % or more than two alleles were removed. The filtering procedure was relatively stringent (allowing only 10 % missing data) to diminish a library effect when combining the two northern populations from Sweden (d21 and d60) with the rest of the populations.
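The locus-level thresholds listed above can be illustrated with a minimal Python sketch. It is not the VCFtools procedure used in the study: the function name `keep_locus`, the genotype coding (number of alternate alleles, with `None` for a masked genotype) and the choice to compute heterozygosity over called genotypes only are assumptions made for illustration.

```python
from typing import Optional, Sequence

# Hypothetical coding: number of alternate alleles (0, 1 or 2);
# None marks a genotype that was masked as missing in an earlier step.
Genotype = Optional[int]


def keep_locus(
    genotypes: Sequence[Genotype],
    n_alleles: int = 2,
    max_missing: float = 0.10,
    min_maf: float = 0.05,
    max_het: float = 0.70,
) -> bool:
    """Return True if a locus passes the four locus-level thresholds."""
    # Remove loci with more than two alleles.
    if n_alleles > 2:
        return False
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    # Remove loci with a missing rate above 10 %.
    missing_rate = 1 - len(called) / len(genotypes)
    if missing_rate > max_missing:
        return False
    # Remove loci with a minor allele frequency below 5 %.
    alt_freq = sum(called) / (2 * len(called))
    if min(alt_freq, 1 - alt_freq) < min_maf:
        return False
    # Remove loci where more than 70 % of called genotypes are heterozygous
    # (assumption: the rate is taken over called genotypes only).
    het_rate = sum(1 for g in called if g == 1) / len(called)
    return het_rate <= max_het


print(keep_locus([0, 1, 2, 1, 0, 0, 1, 0, 0, 0]))  # common, well-called SNP
print(keep_locus([0] * 19 + [1]))                  # rare allele, MAF 2.5 %
```

Applying the missing-rate check before the allele-frequency checks means that frequencies are only estimated for loci with enough called genotypes; the order does not change which loci are kept.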

2.6 Structure analysis and population differentiation

fastStructure (Raj, Stephens and Pritchard 2014) is a variational framework for inferring population structure from large SNP data sets, and was used to investigate whether population genetic structure is present in the Scots pine samples. The model complexity, K, is chosen by testing a range of assumed values, from which the optimal K is identified. However, it is also likely that no “true” value of K can be reached since real populations never conform exactly to the model. Bar plots of the clusters for different K were created using Distruct 2.2 (Rosenberg 2004). Relatedness among individuals in each population was investigated with relatedness estimators (Queller and Goodnight 1989; Ritland 1996) to remove highly related samples that might have been collected from the same mother tree, or individuals that were unknowingly collected twice during needle sampling. High relatedness between two samples within a population might influence the inferred structure of that population, and thus only one of those related individuals was used in further analysis.
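How related individuals might be pruned from a table of pairwise relatedness estimates can be sketched as follows. This is an illustration only, not the procedure used in the study: the function name `prune_related`, the greedy strategy (repeatedly dropping the individual involved in the most related pairs) and the default threshold of 0.125 (the expected relatedness of first cousins) are assumptions.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


def prune_related(
    pairwise: Dict[Tuple[str, str], float], threshold: float = 0.125
) -> Set[str]:
    """Return the individuals to remove so that no remaining pair
    has an estimated relatedness above the threshold."""
    # Build an undirected graph linking every highly related pair.
    neighbours = defaultdict(set)
    for (a, b), r in pairwise.items():
        if r > threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    removed = set()
    while True:
        linked = sorted(k for k, v in neighbours.items() if v)
        if not linked:
            break
        # Drop the individual with the most related partners;
        # ties are broken alphabetically so the result is reproducible.
        worst = max(linked, key=lambda k: len(neighbours[k]))
        for other in neighbours[worst]:
            neighbours[other].discard(worst)
        neighbours[worst] = set()
        removed.add(worst)
    return removed


# Hypothetical estimates for four seedlings from one population.
estimates = {("a", "b"): 0.50, ("b", "c"): 0.30, ("c", "d"): 0.01}
print(sorted(prune_related(estimates)))
```

Dropping the most connected individual first tends to keep more samples than removing one member of each pair in arbitrary order, which matters when a whole family has been sampled from the same mother tree.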

Genetic differentiation among populations was determined using pairwise FST (Weir and Cockerham 1984) in Arlequin 3.5 (Excoffier and Lischer 2010), where statistical significance was assessed by a permutation test (1023 permutations, significance level = 0.05). Arlequin was also used to perform analysis of molecular variance (AMOVA) (Excoffier, Smouse and Quattro 1992) with F-statistics to estimate the proportion of genetic variability found among groups, among populations within groups and within populations. AMOVA was run with populations grouped according to the genetic clusters found by fastStructure, and the significance of the fixation index was evaluated through permutation (1023 permutations). To use Arlequin, the VCF file containing the final SNP data set was first converted to a PLINK PED file using VCFtools and then to a standard Arlequin file using PGDSpider 2.1.1.5 (Lischer and Excoffier 2012).

2.7 Genetic diversity

Observed (Ho) and expected (He) heterozygosity, as well as nucleotide diversity (π), were calculated within populations, within clusters and overall using Arlequin 3.5. Populations were grouped according to the genetic clusters found by fastStructure when calculating Ho, He and π within clusters.
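For a biallelic SNP, the two heterozygosity measures can be sketched in a few lines of Python. This is a textbook illustration rather than Arlequin's implementation: the function names are hypothetical, and the small-sample correction 2n / (2n - 1) applied to He is an assumption that should be checked against the Arlequin manual before comparing values.

```python
from typing import Iterable, Sequence, Tuple


def locus_heterozygosity(genotypes: Sequence[int]) -> Tuple[float, float]:
    """Observed and expected heterozygosity at one biallelic SNP.

    genotypes: alternate-allele counts (0, 1 or 2) for individuals with
    a called genotype; missing genotypes must be excluded beforehand.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no called genotypes at this locus")
    # Ho: proportion of heterozygous individuals.
    h_obs = sum(1 for g in genotypes if g == 1) / n
    # He: 2pq from the allele frequency, with an assumed
    # small-sample correction for 2n sampled gene copies.
    p = sum(genotypes) / (2 * n)
    h_exp = 2 * p * (1 - p) * (2 * n) / (2 * n - 1)
    return h_obs, h_exp


def mean_heterozygosity(loci: Iterable[Sequence[int]]) -> Tuple[float, float]:
    """Average Ho and He over a collection of loci."""
    per_locus = [locus_heterozygosity(g) for g in loci]
    ho = sum(h for h, _ in per_locus) / len(per_locus)
    he = sum(h for _, h in per_locus) / len(per_locus)
    return ho, he


print(locus_heterozygosity([0, 1, 1, 2]))
```

When Ho is close to He, genotype frequencies are close to Hardy-Weinberg proportions; a marked deficit of Ho relative to He would instead point to inbreeding or hidden substructure.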

3 Results

3.1 Sequencing data quality and GBS results

GBS is a relatively long protocol. Each step in the GBS preparation procedure, from DNA isolation and quantification to size selection, can influence the sequence data recovery and quality, which in turn affects the genetic inferences. To ensure the reliability of the data, quality control was performed at each step of the procedure. A summary of the sequence data quality from all three GBS libraries is presented in Table 2.

Table 2. Data quality summary for the three GBS libraries. Error rate: base error rate. Q20, Q30: (base count with Phred value >20 or >30) / (total base count). The Phred quality value is logarithmically related to the sequencing error rate. A Q-score of 20 corresponds to a base call accuracy of 99 % and Q30 to a base call accuracy of 99.9 % (Ewing and Green 1998).

Sample  Raw reads    Raw data (Gb)  Effective (%)  Error (%)  Q20 (%)  Q30 (%)
JO1     365,381,336  109.6          97.16          0.02       96.14    90.75
JO2     354,458,894  106.3          98.05          0.02       96.30    91.05
JO3     364,485,656  109.3          98.51          0.02       96.03    90.59
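The accuracies quoted for Q20 and Q30 follow from the definition of the Phred score, Q = -10 log10(P), where P is the probability that a base call is wrong. A short Python sketch of the conversion in both directions (the function names are illustrative and not part of any pipeline used in the study):

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Base-call accuracy implied by a Phred quality score Q."""
    return 1.0 - 10.0 ** (-q / 10.0)


def error_to_phred(p_error: float) -> float:
    """Phred quality score implied by a per-base error probability."""
    return -10.0 * math.log10(p_error)


print(phred_to_accuracy(20))   # accuracy for Q20
print(phred_to_accuracy(30))   # accuracy for Q30
print(error_to_phred(0.0002))  # Phred score for a 0.02 % error rate
```

By the same relationship, the 0.02 % base error rate reported for each library in Table 2 corresponds to an average Phred score of roughly 37.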

All three libraries were successful, with an even recovery of sequence reads. Each individual had 2,355,433 reads on average with a mapping rate of 96.36 % to the reference genome, P. taeda (Table 3). The average genome coverage was 2.53 mega-base pairs (Mbp) and the average sequence depth was 87x.

Table 3. Summary of GBS results for the analyzed Scots pine seedlings. Numbers in parentheses are ±1 standard error. Reads with low coverage or depth (less than 5x) were discarded. Coverage is the expected coverage based on the number and length of high-quality reads after alignment, and depth is the redundancy of coverage (Sims et al. 2014).

Measure                Mean                 Median
Reads per individual   2,355,433 (±94,478)  1,630,707
Coverage (Mbp, >=5x)   2.53 (±0.026)        2.45
Depth (>=5x)           87x (±1.51)          32x
Mapping rate           96.36 % (±0.153)     97.41 %

3.2 Structure analysis

Four individuals were removed due to low coverage. After hard filtering but without the MAF filter, 99,820 SNPs were recovered from 1034 individuals; with the 5 % MAF filter, 11,019 SNPs were kept. Relatedness analysis identified 202 individuals as highly related (more closely related than first cousins). The highest relatedness was found in population 54, where 45 out of 57 individuals were filtered out (Table 4). No clear geographic pattern in relatedness could be detected across the sampled region, although three Norwegian populations have relatively high relatedness compared to all except population 54 in Russia. Apart from these highly related populations, the number of related individuals in each population seems to be randomly distributed.

Table 4. Number of related and unrelated individuals in each population.

Population ID  Country  Related  Unrelated  Total
13             Finland  17       33         50
26             Finland  0        20         20
29             Finland  0        19         19
39             Finland  4        16         20
41             Finland  1        64         65
48             Finland  7        10         17
2              Norway   21       38         59
5              Norway   1        59         60
20             Norway   26       34         60
25             Norway   2        18         20
31             Norway   3        62         65
37             Norway   11       49         60
42             Norway   25       40         65
4              Russia   6        18         24
11             Russia   1        18         19
17             Russia   0        20         20
28             Russia   0        19         19
35             Russia   3        15         18
45             Russia   9        53         62
53             Russia   0        8          8
54             Russia   45       12         57
10             Sweden   3        56         59
21             Sweden   1        19         20
d5             Sweden   2        47         49
d21            Sweden   3        46         49
d60            Sweden   11       39         50
Total                   202      832        1034

After removing the related individuals, 832 individuals and 6,034 SNPs were used in the population structure analysis. In fastStructure, values of K from 2 to 29 were tested, and the optimal K was found to be between 2 and 3 (Fig. 2). Increasing K beyond 3 did not identify more genetic components in the populations, and the grouping pattern remained unchanged (Fig. 2, K=3 vs. K=4). The difference between K=2 and K=3 was an extra component, represented by populations d21 and d60, detected under K=3. These two populations were genotyped in a previous GBS preparation, so this difference is most likely due to the library effect, even though stringent filtering was applied. K=2 was therefore selected as the most likely number of clusters in these samples.

Figure 2. Bar plot showing the results of genetic composition from fastStructure where each color represents a different ancestry. The plot is arranged according to longitudinal distribution of the sampled populations from west to east.

The spatial distribution of the two genetic clusters showed a clear clinal pattern of variation going from west to east, where one polymorphism is dominant in Norway and Sweden and another in Russia (Fig. 3). Finnish populations have a relatively high level of admixture between both polymorphisms, but consist mainly of the polymorphism present in Sweden and Norway. Thus, these clusters are henceforth defined as Fennoscandian and Russian.

Figure 3. Pie chart showing genetic composition at K=2 for each population.

3.3 Population differentiation

AMOVA results (Table 5) show that the greatest diversity is among individuals within populations, which explains 98.8 % of the total variance. Differentiation between the two genetic clusters is very low, with only 0.66 % of the variation found between them (FST = 0.01202). This low differentiation is also observed in the pairwise comparisons between the sampled populations (Appendix 1). These data suggest a weak population structure and differentiation in Scots pine across Fennoscandia and Russia, likely caused by extensive gene flow across the whole region.

Table 5. Analysis of molecular variance based on 1023 permutations (significance level 0.05) over 6034 loci.

Source of variation              d.f.  Sum of squares  Percentage of variance  P
Among groups                     1     3570.866        0.66                    <0.05
Among populations within groups  24    23654.149       0.54                    <0.05
Within populations               1638  1200883.720     98.80                   <0.05

Fixation index: FST = 0.01202.
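The link between the variance percentages in Table 5 and the fixation index can be made explicit with the standard hierarchical AMOVA definitions (Excoffier, Smouse and Quattro 1992). The Python sketch below uses a hypothetical function name and the rounded percentages from the table, so the result only approximates the value reported by Arlequin.

```python
def fixation_indices(among_groups, among_pops_within_groups, within_pops):
    """Return (F_CT, F_SC, F_ST) from the three AMOVA variance components.

    The components may be given as variances or as percentages of the
    total variance; only their ratios matter.
    """
    total = among_groups + among_pops_within_groups + within_pops
    # F_CT: share of the total variance found among groups.
    f_ct = among_groups / total
    # F_SC: differentiation among populations within groups.
    f_sc = among_pops_within_groups / (among_pops_within_groups + within_pops)
    # F_ST: differentiation among all populations relative to the total.
    f_st = (among_groups + among_pops_within_groups) / total
    return f_ct, f_sc, f_st


# Percentages of variance from Table 5.
f_ct, f_sc, f_st = fixation_indices(0.66, 0.54, 98.80)
print(f"F_CT = {f_ct:.4f}, F_SC = {f_sc:.4f}, F_ST = {f_st:.4f}")
```

With these inputs F_ST comes out at about 0.012, in line with the reported 0.01202, while the among-cluster component alone (F_CT) is about 0.0066. Under these definitions the reported FST therefore appears to summarize differentiation among all populations relative to the total, with the 0.66 % figure describing the share attributable to the two clusters.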

3.4 Genetic diversity in Scots pine

The overall genetic diversity in Scots pine, as measured by Ho, He and π, is high, although all measurements show slightly higher values in Fennoscandian populations than in Russian populations (Table 6). Ho and π were 0.2496 and 0.2365, respectively, in the Russian cluster, while the corresponding values were 0.2592 and 0.2461 in the Fennoscandian cluster. Ho and He are almost equal both overall and for each cluster, suggesting extensive gene flow between populations in this area. This pattern is seen throughout all populations, in accordance with the high outbreeding of the species (Table 6).

Table 6. Genetic diversity measured as observed (Ho) and expected heterozygosity (He) and nucleotide diversity (π) for all populations. Heterozygosity is measured as an average over all polymorphic sites in each population and nucleotide diversity is measured as an average over all loci. N is the number of individuals remaining in each population after the last filtering procedure. S.E. is ±1 standard error of the mean.

Population ID          Country  N    Ho      S.E.    He      S.E.    π       S.E.
17                     Russia   20   0.2726  0.0022  0.2809  0.0019  0.2542  0.0016
53                     Russia   8    0.3230  0.0027  0.3252  0.0021  0.2512  0.0016
28                     Russia   19   0.2639  0.0022  0.2847  0.0020  0.2274  0.0014
35                     Russia   15   0.2302  0.0020  0.2884  0.0019  0.2199  0.0014
11                     Russia   18   0.2528  0.0021  0.2828  0.0019  0.2332  0.0015
54                     Russia   12   0.2784  0.0024  0.3065  0.0020  0.2128  0.0014
45                     Russia   53   0.2578  0.0020  0.2723  0.0019  0.2377  0.0015
4                      Russia   18   0.2663  0.0022  0.2846  0.0019  0.2454  0.0015
Russian cluster                 163  0.2496  0.0018  0.2704  0.0018  0.2365  0.0014
29                     Finland  19   0.2830  0.0023  0.2794  0.0019  0.2523  0.0016
26                     Finland  20   0.2507  0.0021  0.2814  0.0019  0.2512  0.0016
39                     Finland  16   0.2590  0.0022  0.2903  0.0019  0.2350  0.0015
41                     Finland  64   0.2514  0.0019  0.2723  0.0018  0.2436  0.0015
13                     Finland  33   0.2477  0.0020  0.2751  0.0019  0.2266  0.0014
48                     Finland  10   0.2929  0.0025  0.3098  0.0020  0.2458  0.0016
31                     Norway   62   0.2381  0.0019  0.2711  0.0018  0.2339  0.0014
25                     Norway   18   0.2350  0.0021  0.2829  0.0020  0.2302  0.0014
37                     Norway   49   0.2624  0.0020  0.2727  0.0018  0.2482  0.0015
5                      Norway   59   0.2198  0.0018  0.2722  0.0018  0.2262  0.0014
42                     Norway   40   0.2639  0.0021  0.2732  0.0019  0.2504  0.0015
20                     Norway   34   0.2768  0.0022  0.2755  0.0019  0.2606  0.0016
2                      Norway   38   0.2650  0.0021  0.2745  0.0019  0.2409  0.0015
21                     Sweden   19   0.2387  0.0021  0.2875  0.0019  0.2260  0.0014
10                     Sweden   56   0.2558  0.0020  0.2740  0.0018  0.2520  0.0015
d21                    Sweden   46   0.3093  0.0024  0.2859  0.0018  0.2647  0.0016
d60                    Sweden   39   0.3016  0.0024  0.2880  0.0018  0.2621  0.0016
d5                     Sweden   47   0.2948  0.0023  0.2777  0.0018  0.2467  0.0015
Fennoscandian cluster           669  0.2592  0.0018  0.2758  0.0017  0.2461  0.0015
Overall                         832  0.2573  0.0018  0.2754  0.0017  0.2448  0.0015

4 Discussion

4.1 Genotyping-by-sequencing

Three GBS libraries were prepared in this study. Results from the data quality control showed that all libraries were successful, with an even recovery of sample reads. The libraries overlap almost perfectly in base-pair size range, which is crucial for keeping as much data as possible when combining sequences from multiple libraries for joint analysis. Each individual had an average mapping rate to P. taeda of 96.36 %, indicating that the sequencing reads are of high quality and giving no reason to suspect contamination. On average, each nucleotide is represented 87 times in the sequence reads, as given by the estimated average depth. High depth compensates for shortcomings of the sequencing method (e.g. sequencing errors) (Sims et al. 2014).

Although GBS is a rather new method in evolutionary genomics, it has developed quickly for species with smaller genomes than conifers (Elshire et al. 2011; Pan et al. 2015; Andrews et al. 2016). Since conifer genomes pose a challenge due to their size and large amount of repetitive sequence, the GBS protocol is constantly being improved to reduce the complexity of the sequencing requirements and thus the cost of sequencing (Elshire et al. 2011). The selection of restriction enzyme (RE) is a key step when preparing GBS libraries for conifers. PstI is sensitive to cytosine methylation and able to reduce the repetitive sequences in pine genomes to a level of only 34 % (Pan et al. 2015), making it a useful RE for working with conifer genomes.

However, there are potential problems in using GBS for genetic investigations, one of them being allele dropout. Unfortunately, this kind of problem cannot be controlled for, since allele dropout occurs when a mutation at the recognition site causes the RE to fail to cut at that position. The cutting failures lead to null alleles (alleles that are not recognized and therefore not sequenced) and could cause genotyping errors if SNPs are present within these null alleles. Individuals that are heterozygous for the allele are then seen as homozygous because that SNP is lost from the sequence reads. Genotyping errors produced in this way can lead to an underestimation of genomic diversity or overestimation of FST (Andrews et al. 2016).
One of the main difficulties with the GBS data set in this study was the occurrence of a library effect when combining these libraries with previous data. In order to merge populations d21 and d60 with the rest of the populations, a stringent filtering procedure was applied to control the missing rate, which led to many SNPs being filtered out. Even with this stringent filtering, the library effect was still present in the structure analysis. This effect is most likely due to different size ranges in the size selection step during GBS library preparation. If the libraries do not overlap well enough, a large amount of data will be lost.

4.2 Low population structure and high genetic diversity in Scots pine

The structure analysis detected two genetic clusters across the large sampled area, one dominant in Russia and one dominant in Norway and Sweden. This indicates a weak population structure for Scots pine in northern Europe with no distinct subgroups in the area, in accordance with previous studies of Scots pine population structure (Dvornyk et al. 2002; Pyhäjärvi et al. 2007; Zimmer and Sønstebø 2018). The low population differentiation shows the impact of extensive gene flow among populations across the sampled region. Pines disperse pollen efficiently, which decreases overall population structure and differentiation, as seen in the low pairwise FST values. Scots pine is highly outbreeding, which secures high genetic diversity and low levels of inbreeding in natural populations. This is confirmed by the high genetic diversity seen in these populations, as well as in other studies of Scots pine in northern Europe (e.g. Wójkiewicz, Litkowiec and Wachowiak 2016; Dering et al. 2017; Grivet et al. 2017; Zimmer and Sønstebø 2018), showing that efficient pollen dispersal has a homogenizing effect over the whole distribution range.

However, Naydenov et al. (2007) found high population differentiation between populations ranging from Western Europe to eastern Russia. The authors also discovered four genetically distinct lineages, in contrast to the two clusters found in this study. These different results are likely due to the choice of marker: Naydenov et al. studied mitochondrial DNA, which is maternally inherited and dispersed by seeds, in contrast to the biparentally inherited nuclear DNA markers used in this study. Mitochondrial DNA markers generally reveal higher population differentiation because of the low dispersal ability of seeds, which is also observed in other studies where mitochondrial DNA was used (Pyhäjärvi, Salmela and Savolainen 2008; Semerikov et al. 2018). Nuclear DNA markers are dispersed by both pollen and seeds, which decreases population differentiation (Wójkiewicz, Litkowiec and Wachowiak 2016; Zimmer and Sønstebø 2018). Despite this difference between nuclear and mitochondrial DNA markers, the distribution of genetic clusters in this study corresponds to the distribution found in previous studies, indicating that the different markers tell a somewhat similar story.

Relatedness analysis revealed a large number of related individuals in some populations. One of the Russian populations underwent a substantial reduction in sample size because 45 out of 57 individuals were highly related. This high number of related individuals is probably due to mating among relatives or sampling from a limited number of mother trees, and would have decreased heterozygosity had they not been removed. Whether the related individuals reflect sampling bias or the true mating situation requires further investigation with additional sampling.

4.3 Glacial refugium and postglacial migration

The results presented in Figure 3 display a clear distinction between Fennoscandian and Russian populations, although the two are only weakly differentiated, as seen by the 0.66 % variation between them. Populations in Norway and Sweden are relatively pure compared to populations in Finland and Russia, which could indicate a local origin of the “red” polymorphism. This would support the hypothesis stated by Kullman (2008) and Parducci et al. (2012) that conifers survived in ice-free refugia located in northern Scandinavia during the glaciation, and that modern populations are descendants of ancient populations derived from these northern locations. Russian populations show high levels of the “blue” polymorphism and low levels of the “red” polymorphism. These results would support an eastern origin of the “blue” polymorphism and a postglacial recolonization of Scandinavia by Scots pine from the Russian Plain. A dual recolonization of Scandinavia is possible if the “red” polymorphism originates from Western Europe. However, since this study does not include any southern or Western European populations, no reliable conclusions can be drawn regarding the origin of the polymorphism in Norway and Sweden. It is still relevant to consider a western origin given the results presented in other studies on Scandinavian recolonization history (Taberlet et al. 1998; Sinclair, Morman and Ennos 1999; Dering et al. 2017). Although a western origin is possible, a local origin seems more likely since the “red” polymorphism is present at low levels in the Russian populations. This could indicate an origin from an ancient polymorphism that remains at low levels in modern populations due to extensive gene flow across this region.
The high level of the “red” polymorphism in Norwegian and Swedish populations and the relatively high level of the “blue” polymorphism in Russia give the impression that there are two main recolonization routes. Whether the “red” polymorphism originated from western Scandinavia (local origin) or Western Europe, it slowly expanded northward and eastward into Finland, where it eventually mixed with the “blue” polymorphism from eastern Russia, as seen by the relatively high amount of admixture in Finnish populations. The majority of studies using modern or ancient DNA analyses of postglacial recolonization history include mitochondrial DNA markers (Sinclair, Morman and Ennos 1999; Pyhäjärvi, Salmela and Savolainen 2008; Buchovska et al. 2013). It is easier to follow possible recolonization routes in terms of population differentiation with mitochondrial DNA since it has a limited dispersal ability. Even if mitochondrial DNA might be preferred in that sense, it may not capture the entire evolutionary history due to potential bias of sequence capture and its small effective population size, which results in easy fixation of the mitochondrial genome (Xia et al. 2018). Nuclear DNA-based studies are needed to capture the entire history of the species and are also more useful when studying the phylogeography and gene flow patterns in Scots pine due to the dispersal ability of pollen (Tóth et al. 2017).

4.4 Phenotypic variation is not coupled with genetic variation

Another interesting result from this study is that phenotypic variation and genetic variation do not follow each other. The phenotypic results from Skogforsk display a clear clinal variation in cold hardiness with latitude (Fig. 4). Northern populations show less needle damage than southern populations, which indicates clear local adaptation in Scots pine. This south-north trend in hardiness is not followed by a similar genetic pattern. However, there is also a longitudinal difference, where Russian populations tend to be slightly hardier than Scandinavian populations at the same latitude. This longitudinal difference has been seen in other studies, where populations subjected to a continental climate seem to be hardier than populations in a maritime climate (Andersson and Fedorkov 2004; Andersson Gull et al. 2018). The genetic composition follows a longitudinal variation from east to west with two distinct clusters. Russian populations differ from Scandinavian populations, with Finland having relatively high admixture between both polymorphisms, although containing a larger proportion of the Scandinavian polymorphism than of the Russian one.

Figure 4. Kriging map showing the variation in cold hardiness over the sampled geographical range. Bar to the right presents the degree of hardiness after freeze testing where yellow (high score) indicates high sensitivity and blue (low score) indicates high tolerance to colder environments. Sampling locations are shown as blue dots.

Several hypotheses can be considered to explain why no clear relationship between genotypic and phenotypic variation could be detected. Firstly, our genome sampling might be too limited to detect any link between phenotype and genotype. Secondly, cold hardiness could be a polygenic trait, i.e. a phenotype controlled by many different genes that each make a small contribution to the overall outcome. Polygenic traits are difficult to detect on a genotypic level. Lastly, no clear relationship between genotype and phenotype is expected if phenotypic plasticity is controlled by epigenetic regulation.

Pine genomes contain over 20 billion base pairs and a substantial portion of the genome consists of repetitive regions. GBS is a reduced representation method that makes it easier and more cost-effective to sequence species with large genomes. Despite these appealing advantages, GBS does not come without concerns. Reduced representation sequencing, by definition, samples only a small fraction of the genome, resulting in a high probability that SNPs are recovered from non-coding regions, or regions not linked to the studied traits, making detection of SNPs related to the genetic basis of adaptation almost impossible. For example, the average coverage in this study is 2.53 Mbp, meaning that we sampled only about 0.01 % of the total pine genome. Other concerns involve the unknown genetic background of adaptation in relation to genome size and complexity, or the evolutionary processes influencing the genetic basis of adaptation (Parchman et al. 2018). Reduced representation methods are therefore considered underpowered and cannot be expected to identify a meaningful portion of genic regions involved in adaptation (Tiffin and Ross-Ibarra 2014). Nevertheless, GBS is still a powerful method for detecting structural variants and offers a cost-effective way to generate genome-wide data for large numbers of individuals.

The clinal variation in cold hardiness reflects adaptation of Scots pine to the local climate in this region. A locally adapted population has a higher fitness than populations introduced to that area (Savolainen, Pyhäjärvi and Knürr 2007). This is seen in the southern populations being highly sensitive to colder environments compared to northern populations, which are regularly exposed to lower temperatures. A phenotypic cline like this is common in long-lived tree species with large distribution ranges (Ducousso, Guyon and Krémer 1996; Viherä-Aarnio et al. 2005; Jankowski et al. 2017). The high genetic diversity seen in Scots pine is a key component in facilitating its adaptability to different environmental conditions. With ongoing climatic change, scientists have become increasingly interested in the genetic basis of adaptation in widely distributed species.

The observed phenotypic difference in cold hardiness between southern and northern populations can be an indication of phenotypic plasticity in this species. Phenotypic plasticity describes the ability of an organism to change its phenotype in response to changes in the environment and is considered highly important for climate change adaptation in plant species (Chevin, Lande and Mace 2010; Nicotra et al. 2010). Related populations that differ in the expression of the genotype display adaptive plasticity to that particular environment (Bradshaw 2006). However, the genetics behind plasticity are still poorly understood.
Franks and Hoffmann (2012) mention epigenetics as a study approach for investigating the genetic basis of climate change adaptations. Epigenetics is the study of gene regulation and inheritance via mechanisms that are not explained by changes in the DNA sequence. These epigenetic changes are processes on a molecular level that activate, reduce or disable certain genes, and have been proposed as key for explaining phenotypic plasticity (Bossdorf, Richards and Pigliucci 2008; Duncan, Gluckman and Dearden 2014). If epigenetic regulation of plasticity is occurring for this trait, no clear relationship between genotype and phenotype is expected in these populations. Some studies have shown that environmentally induced epigenetic changes could be inherited for several generations (Richards 2006; Whitelaw and Whitelaw 2006). If these changes are heritable, phenotypic traits such as cold hardiness can be passed on from parent to progeny without altering the gene sequence, resulting in a distinct phenotypic variation relevant to rapid adaptation (Franks and Hoffmann 2012). The large genome sizes in conifers provide an abundance of genomic DNA in which epigenetic regulation could take place. However, in order to determine whether the difference between phenotypic and genetic variation is due to epigenetic changes, more research is needed both on epigenetics in natural populations and on clinal phenotypic variation in long-lived and widely distributed species such as pines. The phenotypic variation seen in these populations could also depend on several genes, which would be difficult to detect on a genotypic level and thus result in no clear relationship between phenotype and genotype. Traits involved in adaptation often rely on more than one gene, where all genes contribute additively to the phenotypic outcome (Savolainen, Pyhäjärvi and Knürr 2007).
Some studies of cold hardiness in trees have suggested polygenic inheritance as a possible explanation for the varying cold tolerance across environmental gradients (Norell et al. 1986; Hurme et al. 1997; Howe et al. 2003). A similar suggestion can be made in this study, since cold hardiness increases continuously from south to north while the genetic variation does not follow the same gradient. However, adaptation per se is known to be highly polygenic (Yeaman et al. 2016), and studies of adaptation to climate change struggle to recover enough genetic evidence to conclude whether phenotypic variation between populations is due to local adaptation or merely a plastic response (Savolainen, Lascoux and Merilä 2013). Even with global warming and rising temperatures around the world, studies of cold hardiness remain important, since conifers at higher latitudes may suffer from spring frost due to earlier growth onset (Andersson Gull et al. 2018). Climate change is also expected to create a temporal shift in seasons, where cold events could start earlier or later relative to the winter period (Varsamis et al. 2019). The ability of a species to adapt to environmental fluctuations is therefore crucial for population persistence in a changing climate.
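The polygenic argument above can be illustrated with a small numerical sketch. It is not part of the analyses in this study: the number of loci, the allele frequencies and the per-allele effect are hypothetical values chosen for illustration, and FST is computed here with a simple heterozygosity-based formula, (HT - HS)/HT, rather than with any particular estimator. Under these assumptions, a trait controlled additively by 100 loci can differ markedly between two populations even though each locus shows an FST of only about 0.01.

```python
def locus_fst(p1, p2):
    """Heterozygosity-based FST for one biallelic locus in two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                        # expected heterozygosity of the pooled sample
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2    # mean expected heterozygosity within populations
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def expected_trait(freqs, effect):
    """Expected additive trait value of a diploid: sum over loci of 2 * p * effect."""
    return sum(2 * p * effect for p in freqs)


# Hypothetical values, chosen only for illustration.
n_loci = 100              # number of causal loci
effect = 1.0              # effect of one allele copy, arbitrary units
south = [0.45] * n_loci   # allele frequency at every locus in a southern population
north = [0.55] * n_loci   # allele frequency at every locus in a northern population

per_locus_fst = locus_fst(south[0], north[0])
trait_gap = expected_trait(north, effect) - expected_trait(south, effect)
print(f"per-locus FST = {per_locus_fst:.4f}, trait difference = {trait_gap:.1f}")
```

Because each locus contributes only a small shift in allele frequency, a marker panel that samples a fraction of the genome may show no cline even when the phenotype does.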

4.5 Conclusions
This study focused on the genetic diversity and population structure of Scots pine from Scandinavia to Russia in order to re-evaluate the colonization history of the species into northern Europe and to explain the variation observed in an adaptive trait, cold hardiness, across the large distribution range. A weak population structure was found, with two genetic clusters arranged in a transition from east to west, suggesting dual colonization of Scandinavia. However, these clusters are only weakly differentiated, as shown by the AMOVA. The “blue” polymorphism clearly displays an origin from eastern Russia, but further investigation is needed to determine the origin of the “red” polymorphism, which is dominant in Norway and Sweden. Future studies should include populations from southern and western parts of Europe to determine the origin of that polymorphism. In contrast to previous studies of cold hardiness, this study investigated both phenotypic variation and genetic composition in the same set of populations, revealing a complex subject in need of further investigation. Scots pine populations across the distribution display a clear clinal variation in cold hardiness reflecting local adaptation. However, this phenotypic cline is not coupled with genetic diversity in the GBS data. A species with a widespread distribution covering different environmental conditions usually shows more adaptive phenotypic variation than species with smaller distribution areas. Gene flow via pollen dispersal is extensive in pines, but in order to establish in new environments the species has to be able to express different phenotypes. Many factors, both genetic and environmental, determine the phenotypic outcome, and the discrepancy between phenotypic and genotypic variation is therefore difficult to interpret. Adaptation is known to be highly polygenic, making its genetic basis a complex subject that requires more extensive genome scans.

Acknowledgements

First and foremost, to my supervisor Xiao-Ru Wang: I am forever grateful for your support and encouragement throughout this year, and for always finding time to answer my questions no matter how simple they were. Thank you for believing in me!

Huge thanks to Alisa Kravtsova and Yuqing Jin, who introduced me to all things lab-related. I would not have been able to work this effectively if it weren’t for your detailed and thorough introduction to extracting DNA and preparing GBS libraries. David Hall and Wei Zhao, you guys receive special thanks for trying to help me understand the world of bioinformatics. It is still a bit confusing to me, but with time and practice I believe I will be as awesome as you are!

It has been a great honor to be a part of this research team!

Last, but not least, to my family and friends: thanks for putting up with me this year! It has not always been easy, but your feedback, support and random hugs got me where I am today.

Thank you!

References

Abdel-Mawgood, A. L. 2012. DNA based techniques for studying genetic diversity. In Caliskan, M. (ed.). Genetic Diversity in Microorganisms. Rijeka: InTech.

Andersson, B. and Fedorkov, A. 2004. Longitudinal differences in Scots pine frost hardiness. Silvae Genetica 53: 76-80.

Andersson Gull, B. A., Persson, T., Fedorkov, A. and Mullin, T. J. 2018. Longitudinal differences in Scots pine shoot elongation. Silva Fennica 52: 10040.

Andrews, K. R., Good, J. M., Miller, M. R., Luikart, G. and Hohenlohe, P. A. 2016. Harnessing the power of RADseq for ecological and evolutionary genomics. Nature Reviews Genetics 17: 81-92.

Bennett, K. D., Tzedakis, P. C. and Willis, K. J. 1991. Quaternary refugia of North European trees. Journal of Biogeography 18: 103-115.

Bogdan, I. K., Kajba, D., Šatović, Z., Schüler, S. and Bogdan, S. 2018. Genetic diversity of pedunculate oak (Quercus robur L.) in clonal seed orchards in Croatia, assessed by nuclear and chloroplast microsatellites. South-East European Forestry 9: 29-46.

Bolger, A. M., Lohse, M. and Usadel, B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114-2120.

Bossdorf, O., Richards, C. L. and Pigliucci, M. 2008. Epigenetics for ecologists. Ecology Letters 11: 106-115.

Botstein, D., White, R. L., Skolnick, M. and Davis, R. W. 1980. Construction of a genetic-linkage map in man using restriction fragment length polymorphisms. American Journal of Human Genetics 32: 314-331.

Bradshaw, A. D. 2006. Unravelling phenotypic plasticity - why should we bother? New Phytologist 170: 644-648.

Buchovska, J., Danusevičius, D., Baniulis, D., Stanys, V., Šikšnianienė, J. B. and Kavaliauskas, D. 2013. The location of the northern glacial refugium of Scots pine based on mitochondrial DNA markers. Baltic Forestry 19: 2-12.

Carlisle, A. and Brown, A. H. F. 1968. Biological flora of the British Isles: Pinus sylvestris L. Journal of Ecology 56: 269-307.

Catchen, J., Hohenlohe, P. A., Bassham, S., Amores, A. and Cresko, W. A. 2013. Stacks: an analysis tool set for population genomics. Molecular Ecology 22: 3124-3140.

Cheddadi, R., Vendramin, G. G., Litt, T., François, L., Kageyama, M., Lorentz, S., Laurent, J. M., de Beaulieu, J. L., Sadori, L., Jost, A. and Lunt, D. 2006. Imprints of glacial refugia in the modern genetic diversity of Pinus sylvestris. Global Ecology and Biogeography 15: 271-282.

Chen, X. and Sullivan, P. F. 2003. Single nucleotide polymorphism genotyping: biochemistry, protocol, cost and throughput. The Pharmacogenomics Journal 3: 77-96.

Chevin, L. M., Lande, R. and Mace, G. M. 2010. Adaptation, plasticity, and extinction in a changing environment: Towards a predictive theory. PLoS Biology 8: e1000357.

Clark, P. U., Dyke, A. S., Shakun, J. D., Carlson, A. E., Clark, J., Wohlfarth, B., Mitrovica, J. X., Hostetler, S. W. and McCabe, A. M. 2009. The last glacial maximum. Science 325: 710-714.

Danecek, P., Auton, A., Abecasis, G., Albers, C. A., Banks, E., DePristo, M. A., Handsaker, R. E., Lunter, G., Marth, G. T., Sherry, S. T., McVean, G., Durbin, R. and 1000 Genomes Project Analysis Group. 2011. The variant call format and VCFtools. Bioinformatics 27: 2156-2158.

Davey, J. W., Hohenlohe, P. A., Etter, P. D., Boone, J. Q., Catchen, J. M. and Blaxter, M. L. 2011. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics 12: 499-510.

Dering, M., Kosiński, P., Wyka, T. P., Pers-Kamczyc, E., Boratyński, A., Boratyńska, K., Reich, P. B., Romo, A., Zadworny, M., Żytkowiak, R. and Oleksyn, J. 2017. Tertiary remnants and Holocene colonizers: Genetic structure and phylogeography of Scots pine reveal higher genetic diversity in young boreal than in relict Mediterranean populations and a dual colonization of Fennoscandia. Diversity and Distributions 23: 540-555.

Ducousso, A., Guyon, J. P. and Krémer, A. 1996. Latitudinal and altitudinal variation of bud burst in western populations of sessile oak (Quercus petraea (Matt) Liebl). Annales des Sciences Forestières 53: 775-782.

Duncan, E. J., Gluckman, P. D. and Dearden, P. K. 2014. Epigenetics, plasticity, and evolution: How do we link epigenetic change to phenotype? Journal of Experimental Zoology (Molecular and Developmental Evolution) 322B: 208-220.

Dvornyk, V., Sirviö, A., Mikkonen, M. and Savolainen, O. 2002. Low nucleotide diversity at the pal1 locus in the widely distributed Pinus sylvestris. Molecular Biology and Evolution 19: 179-188.

Eckenwalder, J. E. 2009. Conifers of the world: the complete reference. London: Timber Press.

Elshire, R. J., Glaubitz, J. C., Sun, Q., Poland, J. A., Kawamoto, K., Buckler, E. S. and Mitchell, S. E. 2011. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE 6: e19379.

Esseen, P.-A., Ehnström, B., Ericson, L. and Sjöberg, K. 1997. Boreal forests. Ecological Bulletins 46: 16-47.

Ewing, B. and Green, P. 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Research 8: 186-194.

Excoffier, L. and Lischer, H. E. L. 2010. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources 10: 564-567.

Excoffier, L., Smouse, P. E. and Quattro, J. M. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes - Application to human mitochondrial DNA restriction data. Genetics 131: 479-491.

Franks, S. J. and Hoffmann, A. A. 2012. Genetics of climate change adaptation. Annual Review of Genetics 46: 185-208.

Fries, C., Johansson, O., Pettersson, B. and Simonsson, P. 1997. Silvicultural models to maintain and restore natural stand structures in Swedish boreal forests. Forest Ecology and Management 94: 89-103.

García Gil, M. R., Floran, V., Östlund, L., Mullin, T. J. and Andersson Gull, B. 2015. Genetic diversity and inbreeding in natural and managed populations of Scots pine. Tree Genetics & Genomes 11: 28.

Grivet, D., Avia, K., Vaattovaara, A., Eckert, A. J., Neale, D. B., Savolainen, O. and González-Martínez, S. C. 2017. High rate of adaptive evolution in two widespread European pines. Molecular Ecology 26: 6857-6870.

Gullberg, U., Yazdani, R., Rudin, D. and Ryman, N. 1985. Allozyme variation in Scots pine (Pinus sylvestris L.) in Sweden. Silvae Genetica 34: 193-201.

Hewitt, G. M. 1996. Some genetic consequences of ice ages, and their role in divergence and speciation. Biological Journal of the Linnean Society 58: 247-276.

Holliday, J. A., Hallerman, E. M. and Haak, D. C. 2018. Genotyping and sequencing technologies in population genetics and genomics. In Rajora, O. P. (ed.). Population Genomics: Concepts, Approaches and Applications. Cham: Springer International Publishing.

Howe, G. T., Aitken, S. N., Neale, D. B., Jermstad, K. D., Wheeler, N. C. and Chen, T. H. H. 2003. From genotype to phenotype: unraveling the complexities of cold adaptation in forest trees. Canadian Journal of Botany 81: 1247-1266.

Hurme, P., Repo, T., Savolainen, O. and Pääkkönen, T. 1997. Climatic adaptation of bud set and frost hardiness in Scots pine (Pinus sylvestris). Canadian Journal of Forest Research 27: 716-723.

Jankowski, A., Wyka, T. P., Żytkowiak, R., Nihlgård, B., Reich, P. B. and Oleksyn, J. 2017. Cold adaptation drives variability in needle structure and anatomy in Pinus sylvestris L. along a 1,900 km temperate-boreal transect. Functional Ecology 31: 2212-2223.

Kullman, L. 2008. Early postglacial appearance of tree species in northern Scandinavia: review and perspective. Quaternary Science Reviews 27: 2467-2472.

Labra, M., Grassi, F., Sgorbati, S. and Ferrari, C. 2006. Distribution of genetic variability in southern populations of Scots pine (Pinus sylvestris L.) from the Alps to the Apennines. Flora 201: 468-476.

Li, H. and Durbin, R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25: 1754-1760.

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R. and 1000 Genome Project Data Processing Subgroup. 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25: 2078-2079.

Lischer, H. E. L. and Excoffier, L. 2012. PGDSpider: an automated data conversion tool for connecting population genetics and genomics programs. Bioinformatics 28: 298-299.

Logan, S. A., Phuekvilai, P., Sanderson, R. and Wolff, K. 2019. Reproductive and population genetic characteristics of leading-edge and central populations of two temperate forest tree species and implications for range expansion. Forest Ecology and Management 433: 475-486.

Lowry, D. B., Hoban, S., Kelley, J. L., Lotterhos, K. E., Reed, L. K., Antolin, M. F. and Storfer, A. 2017. Breaking RAD: an evaluation of the utility of restriction site-associated DNA sequencing for genome scans of adaptation. Molecular Ecology Resources 17: 142-152.

Matías, L. and Jump, A. S. 2012. Interactions between growth, demography and biotic interactions in determining species range limits in a warming world: The case of Pinus sylvestris. Forest Ecology and Management 282: 10-22.

Morand, M. E., Brachet, S., Rossignol, P., Dufour, J. and Frascaria-Lacoste, N. 2002. A generalized heterozygote deficiency assessed with microsatellites in French common ash populations. Molecular Ecology 11: 377-385.

Naydenov, K., Senneville, S., Beaulieu, J., Tremblay, F. and Bousquet, J. 2007. Glacial vicariance in Eurasia: mitochondrial DNA evidence from Scots pine for a complex heritage involving genetically distinct refugia at mid-northern latitudes and in Asia Minor. BMC Evolutionary Biology 7: 233.

Nicotra, A. B., Atkin, O. K., Bonser, S. P., Davidson, A. M., Finnegan, E. J., Mathesius, U., Poot, P., Purugganan, M. D., Richards, C. L., Valladares, F. and van Kleunen, M. 2010. Plant phenotypic plasticity in a changing climate. Trends in Plant Science 15: 684-692.

Norell, L., Eriksson, G., Ekberg, I. and Dormling, I. 1986. Inheritance of autumn frost hardiness in Pinus sylvestris L. seedlings. Theoretical and Applied Genetics 72: 440-448.

Nystedt, B., Street, N. R., Wetterbom, A., Zuccolo, A., Lin, Y. C., Scofield, D. G., Vezzi, F., Delhomme, N., Giacomello, S., Alexeyenko, A., Vicedomini, R., Sahlin, K., Sherwood, E., Elfstrand, M., Gramzow, L., Holmberg, K., Hällman, J., Keech, O., Klasson, L., Koriabine, M., Kucukoglu, M., Käller, M., Luthman, J., Lysholm, F., Niittylä, T., Olson, A., Rilakovic, N., Ritland, C., Rosselló, J. A., Sena, J., Svensson, T., Talavera-López, C., Theißen, G., Tuominen, H., Vanneste, K., Wu, Z. Q., Zhang, B., Zerbe, P., Arvestad, L., Bhalerao, R., Bohlmann, J., Bousquet, J., Garcia Gil, R., Hvidsten, T. R., de Jong, P., MacKay, J., Morgante, M., Ritland, K., Sundberg, B., Thompson, S. L., Van de Peer, Y., Andersson, B., Nilsson, O., Ingvarsson, P. K., Lundeberg, J. and Jansson, S. 2013. The Norway spruce genome sequence and conifer genome evolution. Nature 497: 579-584.

Pan, J., Wang, B. S., Pei, Z. Y., Zhao, W., Gao, J., Mao, J. F. and Wang, X. R. 2015. Optimization of the genotyping-by-sequencing strategy for population genomic analysis in conifers. Molecular Ecology Resources 15: 711-722.

Parchman, T. L., Jahner, J. P., Uckele, K. A., Galland, L. M. and Eckert, A. J. 2018. RADseq approaches and applications for forest tree genetics. Tree Genetics & Genomes 14: 39.

Parducci, L., Jørgensen, T., Tollefsrud, M. M., Elverland, E., Alm, T., Fontana, S. L., Bennett, K. D., Haile, J., Matetovici, I., Suyama, Y., Edwards, M. E., Andersen, K., Rasmussen, M., Boessenkool, S., Coissac, E., Brochmann, C., Taberlet, P., Houmark-Nielsen, M., Larsen, N. K., Orlando, L., Gilbert, M. T. P., Kjær, K. H., Alsos, I. G. and Willerslev, E. 2012. Glacial survival of boreal trees in northern Scandinavia. Science 335: 1083-1086.

Prunier, J., Verta, J. P. and MacKay, J. J. 2016. Conifer genomics and adaptation: at the crossroads of genetic diversity and genome function. New Phytologist 209: 44-62.

Pyhäjärvi, T., García-Gil, M. R., Knürr, T., Mikkonen, M., Wachowiak, W. and Savolainen, O. 2007. Demographic history has influenced nucleotide diversity in European Pinus sylvestris populations. Genetics 177: 1713-1724.

Pyhäjärvi, T., Salmela, M. J. and Savolainen, O. 2008. Colonization routes of Pinus sylvestris inferred from distribution of mitochondrial DNA variation. Tree Genetics & Genomes 4: 247-254.

Queller, D. C. and Goodnight, K. F. 1989. Estimating relatedness using genetic markers. Evolution 43: 258-275.

Raj, A., Stephens, M. and Pritchard, J. K. 2014. fastSTRUCTURE: Variational inference of population structure in large SNP data sets. Genetics 197: 573-589.

Richards, E. J. 2006. Inherited epigenetic variation - revisiting soft inheritance. Nature Reviews Genetics 7: 395-401.

Ritland, K. 1996. Estimators for pairwise relatedness and individual inbreeding coefficients. Genetics Research 67: 175-185.

Rizvanovic, M., Kennedy, J. D., Nogués-Bravo, D. and Marske, K. A. 2019. Persistence of genetic diversity and phylogeographic structure of three New Zealand forest beetles under climate change. Diversity and Distributions 25: 142-153.

Rosenberg, N. A. 2004. DISTRUCT: a program for the graphical display of population structure. Molecular Ecology Notes 4: 137-138.

Savolainen, O., Lascoux, M. and Merilä, J. 2013. Ecological genomics of local adaptation. Nature Reviews Genetics 14: 807-820.

Savolainen, O., Pyhäjärvi, T. and Knürr, T. 2007. Gene flow and local adaptation in trees. Annual Review of Ecology, Evolution, and Systematics 38: 595-619.

Schulman, A. H. 2007. Molecular markers to assess genetic diversity. Euphytica 158: 313-321.

Semerikov, V. L., Semerikova, S. A., Putintseva, Y. A., Tarakanov, V. V., Tikhonova, I. V., Vidyakin, A. I., Oreshkova, N. V. and Krutovsky, K. V. 2018. Colonization history of Scots pine in Eastern Europe and North Asia based on mitochondrial DNA variation. Tree Genetics & Genomes 14: 8.

Sims, D., Sudbery, I., Ilott, N. E., Heger, A. and Ponting, C. P. 2014. Sequencing depth and coverage: key considerations in genomic analyses. Nature Reviews Genetics 15: 121-132.

Sinclair, W. T., Morman, J. D. and Ennos, R. A. 1999. The postglacial history of Scots pine (Pinus sylvestris L.) in western Europe: evidence from mitochondrial DNA variation. Molecular Ecology 8: 83-88.

Skogsindustrierna. 2017. Branschstatistik 2017 [Online]. Available: https://www.skogsindustrierna.se/skogsindustrin/branschstatistik/ [Accessed 2018-11-22].

Svendsen, J. I., Astakhov, V. I., Bolshiyanov, D. Y., Demidov, I., Dowdeswell, J. A., Gataullin, V., Hjort, C., Hubberten, H. W., Larsen, E., Mangerud, J., Melles, M., Möller, P., Saarnisto, M. and Siegert, M. J. 1999. Maximum extent of the Eurasian ice sheets in the Barents and Kara Sea region during the Weichselian. Boreas 28: 234-242.

Szmidt, A. E. and Muona, O. 1985. Genetic effects of Scots pine (Pinus sylvestris L.) domestication. Berlin, Heidelberg: Springer.

Taberlet, P., Fumagalli, L., Wust-Saucy, A. G. and Cosson, J. F. 1998. Comparative phylogeography and postglacial colonization routes in Europe. Molecular Ecology 7: 453-464.

Tautz, D. 1989. Hypervariability of simple sequences as a general source for polymorphic DNA markers. Nucleic Acids Research 17: 6463-6471.

The Swedish National Forest Inventory, S. S. L. 2018. Forest statistics 2018 [Online]. Available: http://skogsstatistik.slu.se/pxweb/en/OffStat/OffStat__ProduktivSkogsmark__Areal/PS_Areal_best%C3%A5ndstyper_tab.px/table/tableViewLayout2/?rxid=9f3bffba-2279-437a-824c-2af3dd8d2a21 [Accessed 2018-11-22].

Tiffin, P. and Ross-Ibarra, J. 2014. Advances and limits of using population genetics to understand local adaptation. Trends in Ecology & Evolution 29: 673-680.

Tóth, E. G., Köbölkuti, Z. A., Pedryc, A. and Höhn, M. 2017. Evolutionary history and phylogeography of Scots pine (Pinus sylvestris L.) in Europe based on molecular markers. Journal of Forestry Research 28: 637-651.

Varsamis, G., Papageorgiou, A. C., Merou, T., Takos, I., Malesios, C., Manolis, A., Tsiripidis, I. and Gailing, O. 2019. Adaptive diversity of beech seedlings under climate change scenarios. Frontiers in Plant Science 9: 1918.

Viherä-Aarnio, A., Häkkinen, R., Partanen, J., Luomajoki, A. and Koski, V. 2005. Effects of seed origin and sowing time on timing of height growth cessation of Betula pendula seedlings. Tree Physiology 25: 101-108.

Vos, P., Hogers, R., Bleeker, M., Reijans, M., van de Lee, T., Hornes, M., Frijters, A., Pot, J., Peleman, J., Kuiper, M. and Zabeau, M. 1995. AFLP - A new technique for DNA-fingerprinting. Nucleic Acids Research 23: 4407-4414.

Wang, X. R., Szmidt, A. E. and Lindgren, D. 1991. Allozyme differentiation among populations of Pinus sylvestris (L.) from Sweden and China. Hereditas 114: 219-226.

Wegrzyn, J. L., Liechty, J. D., Stevens, K. A., Wu, L. S., Loopstra, C. A., Vasquez-Gross, H., Dougherty, W. M., Lin, B. Y., Zieve, J. J., Martinez-Garcia, P. J., Holt, C., Yandell, M., Zimin, A., Yorke, J. A., Crepeau, M., Puiu, D., Salzberg, S. L., de Jong, P., Mockaitis, K., Main, D., Langley, C. H. and Neale, D. B. 2014. Unique features of the Loblolly pine (Pinus taeda L.) megagenome revealed through sequence annotation. Genetics 196: 891-909.

Weir, B. S. and Cockerham, C. C. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38: 1358-1370.

Whitelaw, N. C. and Whitelaw, E. 2006. How lifetimes shape epigenotype within and across generations. Human Molecular Genetics 15: R131-R137.

Wójkiewicz, B., Litkowiec, M. and Wachowiak, W. 2016. Contrasting patterns of genetic variation in core and peripheral populations of highly outcrossing and wind pollinated forest tree species. AoB Plants 8: plw054.

Wu, J., Krutovskii, K. V. and Strauss, S. H. 1999. Nuclear DNA diversity, population differentiation, and phylogenetic relationships in the California closed-cone pines based on RAPD and allozyme markers. Genome 42: 893-908.

Xia, H. H., Wang, B. S., Zhao, W., Pan, J., Mao, J. F. and Wang, X. R. 2018. Combining mitochondrial and nuclear genome analyses to dissect the effects of colonization, environment, and geography on population structure in Pinus tabuliformis. Evolutionary Applications 11: 1931-1945.

Yeaman, S., Hodgins, K. A., Lotterhos, K. E., Suren, H., Nadeau, S., Degner, J. C., Nurkowski, K. A., Smets, P., Wang, T. L., Gray, L. K., Liepe, K. J., Hamann, A., Holliday, J. A., Whitlock, M. C., Rieseberg, L. H. and Aitken, S. N. 2016. Convergent local adaptation to climate in distantly related conifers. Science 353: 1431-1433.

Zimin, A., Stevens, K. A., Crepeau, M., Holtz-Morris, A., Koriabine, M., Marçais, G., Puiu, D., Roberts, M., Wegrzyn, J. L., de Jong, P. J., Neale, D. B., Salzberg, S. L., Yorke, J. A. and Langley, C. H. 2014. Sequencing and assembly of the 22-Gb Loblolly pine genome. Genetics 196: 875-890.

Zimmer, K. and Sønstebø, J. H. 2018. A preliminary study on the genetic structure of Northern European Pinus sylvestris L. by means of neutral nuclear microsatellite markers. Scandinavian Journal of Forest Research 33: 6-13.

Östlund, L. 2004. Fire, death and disorder in the forest: 150 years of change in critical ecological structures and processes in boreal Scandinavia. In Honnay, O., Verheyen, K. and Bossuyt, B. (eds.). Forest Biodiversity: Lessons from History for Conservation. Oxfordshire: CABI Publishing.

Øyen, B. H., Blom, H. H., Gjerde, I., Myking, T., Sætersdal, M. and Thunes, K. H. 2006. Ecology, history and silviculture of Scots pine (Pinus sylvestris L.) in western Norway - a literature review. Forestry 79: 319-329.

Appendix

Appendix 1. Pairwise FST measured between all populations.

     4         45        54        11        35        28        53        17        d5        d60       d21       48        13        2         42        20        10        41        21        39        5         37        25        26        31        29
4    0
45   0.00237   0
54   -0.00144  0.00302   0
11   0.00287   0.00244   0.00437   0
35   0.00206   0.0043    0.00675   0.00399   0
28   0.00300   0.00216   0.00348   0.00051   0.00567   0
53   0.00574   0.00334   0.00711   0.00186   0.00568   0.00045   0
17   0.00151   0.00107   0.00082   0.00009   0.00166   0.00032   0.00071   0
d5   0.00824   0.01441   0.00652   0.01186   0.00773   0.01300   0.01386   0.01267   0
d60  0.00921   0.01422   0.00321   0.01182   0.00793   0.01012   0.01352   0.01334   0.00926   0
d21  0.01280   0.01790   0.00597   0.01428   0.01082   0.01369   0.01585   0.01735   0.00896   0.00092   0
48   0.00403   0.00987   0.00599   0.01288   0.00539   0.01169   0.01255   0.00847   0.00427   0.00582   0.00775   0
13   0.00274   0.00274   0.00840   0.00483   0.00808   0.00378   0.00863   0.00469   0.00067   0.00321   0.00386   0.00165   0
2    0.01023   0.01637   0.00993   0.01427   0.01194   0.01529   0.0.01697 0.01552   0.00592   0.01189   0.00983   0.00820   0.00473   0
42   0.01049   0.01624   0.00882   0.01401   0.01040   0.01450   0.01608   0.01521   0.00650   0.01248   0.01059   0.00866   0.00271   0.00280   0
20   0.00746   0.01373   0.00487   0.01112   0.00647   0.01130   0.01226   0.01207   0.00311   0.01150   0.00911   0.00397   0.00183   0.00222   0.00398   0
10   0.00583   0.01181   0.00243   0.00992   0.00489   0.01049   0.01342   0.01070   0.00239   0.01052   0.00935   0.00317   0.00114   0.00550   0.00477   0.00279   0
41   0.00340   0.00746   0.00082   0.00631   0.00326   0.00634   0.01022   0.00559   0.00761   0.00838   0.01232   0.00291   0.00073   0.00971   0.00989   0.00775   0.00541   0
21   0.00670   0.01266   0.01256   0.01465   0.01175   0.01520   0.01898   0.01117   0.00511   0.00680   0.00445   0.00535   0.00108   0.00502   0.00411   0.00258   0.00033   0.00726   0
39   0.00522   0.01048   0.00619   0.00895   0.00710   0.00866   0.00874   0.00483   0.00539   0.00587   0.00691   0.00331   0.00169   0.00877   0.00570   0.00377   0.00313   0.00228   0.00838   0
5    0.00561   0.00953   0.00651   0.00931   0.00956   0.00983   0.01231   0.00660   0.00614   0.00398   0.00497   0.00565   0.00016   0.00652   0.00471   0.00475   0.00230   0.00306   0.00600   0.00381   0
37   0.01366   0.01799   0.00689   0.01574   0.01287   0.01492   0.01736   0.01593   0.00885   0.01089   0.00880   0.00945   0.00420   0.00750   0.00718   0.00709   0.00591   0.00991   0.00375   0.00692   0.00112   0
25   0.00796   0.01367   0.01475   0.01517   0.01415   0.01695   0.01688   0.01263   0.00527   0.00976   0.00762   0.00789   0.00282   0.00508   0.00347   0.00048   0.00176   0.007669  0.00548   0.00931   0.00681   0.00592   0
26   0.00475   0.00995   0.00212   0.00797   0.00509   0.00677   0.00851   0.00774   0.00795   0.00787   0.01177   0.00373   0.00185   0.01077   0.01027   0.00754   0.00554   0.00239   0.00418   0.00263   0.00139   0.00870   0.00627   0
31   0.00896   0.01388   0.00804   0.01331   0.01260   0.01270   0.01508   0.01228   0.00979   0.00903   0.01043   0.00967   0.00460   0.01012   0.00830   0.00889   0.00675   0.00727   0.00869   0.00724   0.00112   0.00462   0.00875   0.00728   0
29   0.00606   0.00966   0.00158   0.00542   0.00556   0.00467   0.00696   0.00590   0.00756   0.00853   0.01098   0.00490   0.00226   0.00951   0.00798   0.00658   0.00609   0.00110   0.00606   0.00048   0.00170   0.00731   0.00808   0.00193   0.00718   0
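
For readers who want to work with the values in Appendix 1, the following minimal sketch shows one way to expand a lower-triangular list of pairwise FST values into a full symmetric matrix and to look up the most differentiated pair. Only the first four populations (4, 45, 54 and 11) are included, and the function names are illustrative rather than taken from any software used in this study. The slightly negative estimate is kept as reported, although such values are commonly interpreted as zero.

```python
# Pairwise FST values for populations 4, 45, 54 and 11, copied from Appendix 1.
# Each row holds the values to the left of the diagonal (lower triangle).
labels = ["4", "45", "54", "11"]
lower = [
    [],
    [0.00237],
    [-0.00144, 0.00302],
    [0.00287, 0.00244, 0.00437],
]


def to_symmetric(lower_rows):
    """Mirror a lower-triangular list of rows into a full square matrix."""
    n = len(lower_rows)
    full = [[0.0] * n for _ in range(n)]   # diagonal stays at zero
    for i, row in enumerate(lower_rows):
        for j, value in enumerate(row):
            full[i][j] = value
            full[j][i] = value
    return full


def most_differentiated(names, full):
    """Return (FST, population A, population B) for the largest pairwise value."""
    pairs = [
        (full[i][j], names[j], names[i])
        for i in range(len(names))
        for j in range(i)
    ]
    return max(pairs)


fst = to_symmetric(lower)
value, pop_a, pop_b = most_differentiated(labels, fst)
print(f"largest FST: {pop_a} vs {pop_b} = {value:.5f}")
```

The same two functions apply unchanged to the full set of populations once all rows of the table are entered.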
