
MICROSATELLITE MARKERS FOR LIVESTOCK GENETIC DIVERSITY ANALYSES

Karan Veer Singh

National Bureau of Animal Genetic Resources Karnal-132001

 

About 40 species of domestic animals and poultry contribute to meeting the needs of humankind, providing meat, fibre, milk, eggs, draught animal power, skins, and manure, and are an essential component of many mixed farming systems.

Within these species, more than 8000 breeds and strains (FAO, 2011) constitute the animal genetic resources (AnGR) that are of crucial significance for food and agriculture.

According to the report on the Status and trends of animal genetic resources – 2010 (FAO, 2011), approximately 8 percent of reported livestock breeds have become extinct and an additional 21 percent are considered to be at risk of extinction. Moreover, the situation is presently unknown for 35 percent of breeds, most of which are reared in developing countries.

FAO. 2011. Status and trends of animal genetic resources – 2010. Commission on Genetic Resources for Food and Agriculture, Thirteenth Regular Session, Rome, 18–22 July 2011, (CGRFA-13/11/Inf.17). Rome (available at http://www.fao.org/docrep/meeting/022/am649e.pdf).


LIVESTOCK DIVERSITY

LIVESTOCK DIVERSITY IN INDIA

Species        No. of recognized breeds
Buffalo        13
Cattle         37
Sheep          39
Goat           23
Camel          8
Horse/Pony     6
Poultry        15
Pig            2
Donkey         1

Also present: yak, mithun, ducks, geese and other non-descript populations.

It is estimated that 50% of indigenous goat breeds, 27% of indigenous sheep breeds, 20% of indigenous cattle breeds and 26% of poultry breeds in India are threatened.

REASONS FOR DECLINE IN DOMESTIC ANIMAL BIODIVERSITY

• Conservation of indigenous breeds has received little attention in the country.
• No serious efforts are made for conservation of the breeds at risk.
• Lack of basic descriptive information on animal genetic resources.
• Replacement of indigenous breeds by exotic breeds or crossbreds.
• Shifting of traditional farming to commercial farming.

LIVESTOCK GENETIC ANALYSIS

Livestock breed analysis/characterization requires knowledge of genetic variation.

Genetic variation can be effectively measured within and between populations.

Various types of markers are available to assess such genetic variation/polymorphism.

MOLECULAR/DNA MARKERS

Any DNA fragment or gene coding for a trait that is free of environmental effects and does not interact with other genes or alleles is called a DNA marker, viz. RAPD, SSR, RFLP, AFLP, etc.

Typical characteristics

• Not affected by the environment or the developmental stage
• Not tissue/organ/sex specific
• More efficient than protein or biochemical polymorphism
• More informative
• Explore the complete genome and show Mendelian inheritance

MICROSATELLITE/SSR MARKERS

• Litt and Luty 1989 (Am. J. Hum. Gen.)
• Sequences of DNA consisting of repeats of 2-6 base pair motifs; almost any combination is possible (e.g. CA, GA, GGGAA). Polymorphisms are based on the number of repeat units and are hypervariable (have many alleles).

SYNONYMS

Microsatellites are also known as

• simple sequence repeats (SSR)
• short tandem repeats (STR)

REPEAT STRUCTURE OF MICROSATELLITES

Mononucleotide - (A)11: AAAAAAAAAAA
Dinucleotide - (GT)6: GTGTGTGTGTGT
Trinucleotide - (CTG)4: CTGCTGCTGCTG
Tetranucleotide - (ACTC)4: ACTCACTCACTCACTC
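The repeat notation used for microsatellites, such as (GT)6, can be illustrated with a short Python sketch. It expands a motif and count into the full sequence and scans a sequence for perfect tandem repeats with a regular expression. The function names, the minimum repeat threshold and the test sequence are assumptions made for illustration; real SSR-mining tools handle imperfect and compound repeats, which this sketch does not.

```python
import re

def expand_repeat(motif: str, count: int) -> str:
    """Write out a repeat such as (GT)6 as its full sequence."""
    return motif * count

def find_microsatellites(seq: str, min_repeats: int = 4, max_motif_len: int = 6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest motif first,
    so AAAA is reported as motif 'A', not 'AA'.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_len, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits

# Illustrative sequence: a (CTG)4 repeat between two arbitrary flanks.
example = "TTAGA" + expand_repeat("CTG", 4) + "AAGT"
print(find_microsatellites(example))
```
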

TYPES OF MICROSATELLITES BASED ON THE NATURE OF REPEATS

POLYMORPHISM

The repeat region is variable between samples, while the flanking regions where PCR primers bind are constant.

[Figure: two alleles of an AATG repeat locus, one with 7 repeats and one with 8 repeats]

Homozygote = both alleles are the same length

Heterozygote = alleles differ and can be resolved from one another
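Because the flanking regions are constant, the length of the PCR product depends only on the number of repeats. A minimal Python sketch of that arithmetic follows; the combined flank length of 100 bp is a hypothetical value chosen for illustration, not a property of any real locus.

```python
def amplicon_length(flank_bp: int, motif: str, repeats: int) -> int:
    """PCR product length = constant flanks + motif length x repeat count."""
    return flank_bp + len(motif) * repeats

def repeats_from_length(amplicon_bp: int, flank_bp: int, motif: str) -> int:
    """Recover the repeat count from a measured fragment size."""
    repeat_bp = amplicon_bp - flank_bp
    if repeat_bp < 0 or repeat_bp % len(motif) != 0:
        raise ValueError("fragment size is not a whole number of repeats")
    return repeat_bp // len(motif)

FLANK_BP = 100  # hypothetical combined length of both flanks, primers included

allele_7 = amplicon_length(FLANK_BP, "AATG", 7)  # 100 + 4 * 7 = 128
allele_8 = amplicon_length(FLANK_BP, "AATG", 8)  # 100 + 4 * 8 = 132
print(allele_7, allele_8, allele_8 - allele_7)
```

Under these assumptions the 7-repeat and 8-repeat alleles differ by exactly one motif length (4 bp), which is the size difference that electrophoresis has to resolve.
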

EVOLUTION OF MICROSATELLITES

How do microsatellites evolve (mutate)?

• Unequal crossing-over during meiosis
• Replication slippage (DNA polymerase slippage)

Mutation

• It is estimated that microsatellites mutate 100 to 10,000 times as fast as base pair substitutions.
• Microsatellite alleles change rather quickly over time:
  E. coli: 10^-2 events per locus per replication
  Drosophila: 6 × 10^-6 events per locus per generation
  Human: 10^-3 events per locus per generation

MICROSATELLITES - TOOLS OF CHOICE

Low quantities of template DNA required (10-100 ng)

High genomic abundance

Random distribution throughout the genome

High level of polymorphism

Band profiles can be interpreted in terms of loci and alleles

Codominance of alleles

Allele sizes can be determined with an accuracy of 1 bp, allowing accurate comparison across different gels

High reproducibility

Different STRs may be multiplexed in PCR or on gel

Wide range of applications

Amenable to automation

Stutter Bands in SSR

Often there are minor bands in addition to the major bands. These minor bands are called stutter bands (shadow bands) and they usually differ (smaller in size) from the major bands by a few nucleotides.

Homology vs. Homoplasy

• Homology is any similarity between characters that is due to their shared ancestry

• Homoplasy occurs when characters are similar, but are not derived from a common ancestor.


HOW DO WE DEVELOP MICROSATELLITE PRIMERS?

1. DNA extraction
2. Digestion of genomic DNA with restriction enzymes
3. Cloning the resulting fragments into suitable cloning vectors to form a genomic library
4. Plating these cloning vectors on nylon membrane
5. Probing the membrane with labelled oligonucleotides of the desired repeats
6. Culturing the positive clones
7. Cutting the insert out and running it on an agarose gel
8. Sequencing the positive clones and designing the appropriate primers from the flanking regions

WHAT ARE MICROSATELLITES FOR?

• Microsatellites are “junk” DNA. In humans, 90% of microsatellites are found in noncoding regions of the genome.

• Microsatellites may provide a source of genetic variation. In bacteria, variation in microsatellites alleles in coding regions is thought to be adaptive in different environments.

• Microsatellites may help regulate gene expression.

APPLICATIONS

• Forensics and parentage analysis
• Disease diagnosis
• Diversity analysis
• Population studies
• Conservation biology

• Forensics: Because microsatellites are so variable, studying several at one time (to obtain a DNA fingerprint) allows individuals to be identified.

• Paternity studies: Because individuals receive one allele from their mother and one from their father, paternity (or maternity) can be determined.

PARENTAGE VERIFICATION

Exclusion of false parents with a probability as high as 99.999%, against 40-60% from biochemical markers.
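The principle behind parentage verification, that an offspring carries one allele from each parent at every locus, can be sketched as a simple exclusion check in Python. The locus names and allele sizes below are invented for illustration, and the sketch deliberately ignores mutation, null alleles and genotyping error, which real parentage software has to account for.

```python
def compatible_with_parents(child, mother, alleged_father) -> bool:
    """True if one child allele can come from the mother and the other
    from the alleged father. Each genotype is a pair of allele sizes."""
    c1, c2 = child
    return (c1 in mother and c2 in alleged_father) or (
        c2 in mother and c1 in alleged_father
    )

def excluding_loci(child, mother, alleged_father):
    """List the loci at which the alleged father is excluded.

    Arguments are dicts mapping locus name -> (allele, allele).
    """
    return [
        locus
        for locus in child
        if not compatible_with_parents(
            child[locus], mother[locus], alleged_father[locus]
        )
    ]

# Hypothetical genotypes (allele sizes in bp) at two made-up loci.
child = {"LOCUS_1": (120, 124), "LOCUS_2": (200, 204)}
mother = {"LOCUS_1": (120, 122), "LOCUS_2": (200, 208)}
sire_a = {"LOCUS_1": (124, 126), "LOCUS_2": (204, 204)}
sire_b = {"LOCUS_1": (126, 128), "LOCUS_2": (204, 206)}

print(excluding_loci(child, mother, sire_a))  # no exclusions
print(excluding_loci(child, mother, sire_b))  # excluded at LOCUS_1
```

Each additional polymorphic locus adds another chance to exclude a false parent, which is why panels of several microsatellites reach high exclusion probabilities.
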

Disease Diagnosis – Huntington’s disease

Huntington's disease is caused by a genetic defect on chromosome 4. The defect causes a part of DNA, called a CAG repeat, to occur many more times than it is supposed to. Normally, this section of DNA is repeated 10 to 28 times. But in persons with Huntington's disease, it is repeated 36 to 120 times.

DIVERSITY ANALYSIS

• Observed heterozygosity (Ho) and gene diversity or expected heterozygosity (He) are measures of genetic diversity within a population.
• Allelic polymorphisms in a population.

INTRASPECIFIC (WITHIN SPECIES)

Genetic variability between and within breeds, assessed through genetic distances and heterozygosity, to look into the effects of

• Bottlenecks suffered by a breed
• Inbreeding depression due to declining population

Relationship among breeds

• Helps in finding the most diverse groups
• Helps to decide about the conservation programs

[Figure: Allelic patterns across populations. Series: Na, No. of private alleles, He. Horizontal axis: populations; left axis: mean (0 to 12); right axis: heterozygosity (0.000 to 0.900).]

DENDROGRAM BASED ON NEI’S STANDARD GENETIC DISTANCE (Ds)

Radiation tree using individual animals as taxonomic units constructed with a distance matrix with simple allele sharing statistics.

Average membership coefficient (q) for each given breed for k=15 clustering result

INTER-SPECIFIC LEVEL (BETWEEN CLOSELY RELATED SPECIES)

• To study relatedness through phylogenetics
• Reconstruction of the evolutionary relationships among the organisms
• To study cross-species homologies for both coding and non-coding sequences for construction of comparative maps

CONSERVATION BIOLOGY

In order to plan a conservation management strategy, it is necessary to define, record and assess the genetic resources at risk.

Full description or characterization of animal genetic resources is essential. At the level of comparative molecular description, microsatellite markers can be used to establish which breeds harbour significant genetic diversity, in order to better target conservation action.

Which breeds should be prioritized for economically viable conservation plans?

Weitzman Diversity

Breed              Value
Rampur Bushair     3.65
Chokla             7.53
Magra              1.85
Nali               1.83
Marwari            3.35
Jaisalmeri         2.35
Pugal              2.65
Patanwadi          4.28
Sonadi             9.68
Kheri              2.5
Malpura            2.1
Muzzafarnagri      4.98
Jalauni            2.88
Ganjam             7.03
Chhotanagpuri      6.35
Garole             11.3
Deccani            1.85
Madgyal            6.35

The marginal diversity reflects the change of diversity in the whole population in case of an increase in the extinction probability of one breed.

IMPLICATIONS

Diversity analyses provide:

• The overall magnitude of genetic diversity within each livestock species
• The genetic relationships, expressed as genetic distances among breeds, within each species

These results:

• allow for interpretation of gene flow in animal populations, which might be related to human migrations
• possibly give some indication of levels of inbreeding in each breed
• enhance the global information system on domestic animal diversity, and consequently the development of more effective and efficient conservation programmes
• alert national governments to the need to better characterize and conserve the indigenous animal genetic resources
• guide the establishment of sound policies and sustainable agriculture

ANALYSIS OF MICROSATELLITE DATA

Three main steps are involved in the statistical analysis of molecular data in diversity studies:

• Data collection
• Data analysis
• Interpretation of the data

http://www.fao.org/docrep/014/i2413e/i2413e00.htm

Data collection

• Sample collection
• DNA isolation
• PCR amplification
• Checking of PCR products
• Resolution and visualization of different alleles by PAGE, silver staining, autoradiography or by automated sequencer

Sampling Procedure

• Any biological material, such as fresh blood, tissue, hair or bone, may potentially be used for DNA analysis.
• Samples should be collected from unrelated animals by visiting the breeding tract of the breed in question, and not more than 10% of any one herd or village population should be sampled. Whenever possible, pedigree records should be consulted to identify unrelated individuals.
• To achieve clearer differentiation among closely related populations/breeds, it is recommended that 50 unrelated animals per breed (preferably 25 of each sex) be assayed.

DNA Extraction

• Blood samples collected in vacutainer tubes containing an anticoagulant such as EDTA are transported to the laboratory under chilled conditions for further processing.
• Genomic DNA is then isolated from whole blood using proteinase-K digestion followed by standard phenol/chloroform extraction.
• Both the quality and the quantity of the isolated genomic DNA are assessed, and the DNA is subsequently stored at –20°C/4°C for further analysis with microsatellite markers.


DETECTION

The number of repeats can be determined by separating microsatellites by size using electrophoresis:

• Capillary electrophoresis
• Gel electrophoresis

1. Radioactive (P33) end-labelling

[Figure: autoradiogram with allele bands labelled A to F. Lane genotypes: 1 = DD, 2 = BB, 3 = CC, 4 = CF, 5 = AC]

2. Silver staining

[Figure: 6% urea PAGE showing microsatellite polymorphism (BM6526)]

Entry of band/allele information into the computer. It can be done manually or it can be read from gel directly by a computer installed with software.

Multiplex PCR (Parallel Sample Processing)

Compatible primers are the key to successful multiplex PCR

10 or more STR loci can be simultaneously amplified

Advantages of Multiplex PCR

• Increases information obtained per unit time (increases power of discrimination)
• Reduces labor to obtain results
• Reduces template required (smaller sample consumed)

Challenges to Multiplexing

• Primer design to find compatible primers (no program exists)
• Reaction optimization is highly empirical, often taking months

GENOTYPING

Each individual can be genotyped manually by scoring the bands (alleles) as two digits or as their integer size in base pairs; heterozygous individuals yield two bands and homozygous individuals yield one band.

A. Because humans are diploid organisms, each individual has two alleles per locus.

B. Individuals could be:

1. Homozygous: two copies of the same overall length

2. Heterozygous: two copies of different overall length

C. Many alleles exist in a population, with the maximum number of alleles being two times the number of people in the population.
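The scoring rule for a diploid individual (one band means homozygous, two bands mean heterozygous) can be written as a small Python sketch. The function names and the band sizes are illustrative assumptions; stutter bands and null alleles, which complicate real scoring, are not handled.

```python
def score_genotype(band_sizes):
    """Turn the band sizes (bp) scored for one individual at one locus
    into a diploid genotype (allele_1, allele_2)."""
    sizes = sorted(set(band_sizes))
    if len(sizes) == 1:
        return (sizes[0], sizes[0])   # one band: homozygous
    if len(sizes) == 2:
        return (sizes[0], sizes[1])   # two bands: heterozygous
    raise ValueError("a diploid individual should show one or two bands")

def is_heterozygous(genotype) -> bool:
    """A genotype is heterozygous when its two alleles differ in size."""
    return genotype[0] != genotype[1]

def max_alleles(n_individuals: int) -> int:
    """Upper bound on distinct alleles at a locus: two per diploid individual."""
    return 2 * n_individuals

print(score_genotype([128]))        # (128, 128)
print(score_genotype([132, 128]))   # (128, 132)
print(max_alleles(50))              # 100
```
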

Statistical Parameters for estimation of the Variability

• Heterozygosity

• Polymorphism Information Content (PIC)

• Genetic Distances

• Divergence times

• Probability of individual identification

• Probability of exclusion of false parents


Allele number

Alleles are a set of alternative forms of the same gene occupying the same relative position or locus on homologous chromosomes. Allele number is the total number of alleles with a non-zero frequency for a given marker/locus in a population. The allele number for each locus can be determined manually from the silver-stained gels/autoradiograms.

Statistical Analysis of Data

Allele Frequency

The frequency of an allele ‘A’ is the number of ‘A’ alleles in the population divided by the total number of alleles/genes.

It gives an indication of the most or least prevalent alleles in the population.

The allele frequency is affected over time by forces such as genetic drift, mutation and migration.
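Allele frequency and allele number can be computed directly from scored genotypes by counting. The Python sketch below assumes each genotype is a pair of allele sizes at a single locus; the four genotypes are made-up values for illustration only.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Frequency of each allele = copies of that allele / total allele copies.

    `genotypes` is a list of (allele, allele) pairs for one locus.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())   # 2 x number of diploid individuals
    return {allele: n / total for allele, n in counts.items()}

def allele_number(genotypes) -> int:
    """Number of alleles observed with a non-zero frequency."""
    return len(allele_frequencies(genotypes))

# Hypothetical genotypes (allele sizes in bp) for four animals.
sample = [(120, 120), (120, 124), (124, 126), (120, 126)]
print(allele_frequencies(sample))   # {120: 0.5, 124: 0.25, 126: 0.25}
print(allele_number(sample))        # 3
```
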

Heterozygosity

Heterozygosity is the state of possessing different alleles at a given locus in regard to a given character. It is a measure of heterozygotes or genic variation in a population. The population heterozygosity at a locus is given by the formula:

$$H = 1 - \sum_{i=1}^{n} p_i^2$$

where $\sum$ stands for summation over all $n$ alleles (Nei, 1978) and $p_i$ is the frequency of the $i$th allele at a locus in a population. The average heterozygosity per locus ($\bar{H}$) is defined as the mean of $H$ over all structural loci in the genome. However, the unbiased estimate of the expected heterozygosity at a locus (if N < 50) is:

$$H_E = \frac{2N}{2N-1}\left(1 - \sum_{i=1}^{n} p_i^2\right)$$

where $N$ is the number of individuals sampled.
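As a worked illustration of the heterozygosity measures, the Python sketch below computes observed heterozygosity, expected heterozygosity (one minus the sum of squared allele frequencies) and the small-sample estimate that multiplies it by 2N/(2N - 1). The function names and the example data are assumptions for illustration.

```python
def observed_heterozygosity(genotypes) -> float:
    """Ho: proportion of individuals whose two alleles differ."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

def expected_heterozygosity(freqs) -> float:
    """He = 1 - sum of squared allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)

def unbiased_expected_heterozygosity(freqs, n_individuals: int) -> float:
    """Small-sample estimate: 2N / (2N - 1) * (1 - sum p_i^2)."""
    two_n = 2 * n_individuals
    return two_n / (two_n - 1) * expected_heterozygosity(freqs)

# Hypothetical locus: four animals, allele frequencies 0.5, 0.25, 0.25.
genotypes = [(120, 120), (120, 124), (124, 126), (120, 126)]
freqs = [0.5, 0.25, 0.25]

print(observed_heterozygosity(genotypes))            # 3 of 4 are heterozygous
print(expected_heterozygosity(freqs))                # 1 - 0.375 = 0.625
print(unbiased_expected_heterozygosity(freqs, 4))    # 8/7 * 0.625
```

With only four animals the correction factor 8/7 is noticeable; as N grows it approaches 1 and the two estimates of He converge.
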

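Polymorphism Information Content (PIC)

The polymorphism information content is another important measure of DNA polymorphism. The expected value of PIC for each locus is calculated as per Botstein et al. (1980):

$$PIC = 1 - \sum_{i=1}^{n} p_i^2 - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 p_i^2 p_j^2$$

where $p_i$ and $p_j$ are the frequencies of the $i$th and $j$th alleles and $n$ is the number of alleles at the locus.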
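The PIC of Botstein et al. (1980) is one minus the sum of squared allele frequencies, minus the sum over all allele pairs of twice the product of their squared frequencies. A minimal Python sketch follows; the function name and the example frequencies are illustrative assumptions.

```python
from itertools import combinations

def pic(freqs) -> float:
    """Polymorphism information content for one locus.

    PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2
    """
    freqs = list(freqs)
    squared_sum = sum(p * p for p in freqs)
    pair_term = sum(
        2 * (p_i ** 2) * (p_j ** 2) for p_i, p_j in combinations(freqs, 2)
    )
    return 1.0 - squared_sum - pair_term

# Two equally frequent alleles: 1 - 0.5 - 2 * 0.25 * 0.25 = 0.375
print(pic([0.5, 0.5]))
# Four equally frequent alleles give a higher PIC than two.
print(pic([0.25, 0.25, 0.25, 0.25]))
```

For a locus with two equally frequent alleles the expected heterozygosity is 0.5 but the PIC is 0.375, showing that PIC is always somewhat lower than He for the same frequencies.
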

Genetic Distancing

• Genetic distance expresses the genetic differences between two populations as a single number.
• It is the basis for constructing phylogenetic trees.
• Different sets of data require different kinds of distance measures.
• The different models differ in their assumptions about population divergence and in the basis used to estimate breed relationships (coancestry coefficient, proportion of shared alleles, probability of gene identity between two populations).

Methods of genetic distancing

• Nei's (1972) standard genetic distance
• Average square distance (Goldstein et al., 1995)
• Delta mu squared, (δμ)², distance (Goldstein et al., 1995)
• Reynolds' genetic distance (Reynolds et al., 1983)
• Slatkin's (1995) genetic distance (Rst)
• Cavalli-Sforza and Bodmer's (1971) kinship coefficient distance (Dkf)
• Proportion of shared alleles distance (Dps) (Bowcock et al., 1994)
• Cavalli-Sforza and Edwards (1967) chord distance (Dc)
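As one example of how a genetic distance reduces two populations to a single number, the Python sketch below computes Nei's (1972) standard genetic distance, Ds = -ln(Jxy / sqrt(Jx Jy)), where Jx, Jy and Jxy are the gene identities within and between the populations averaged over loci. The data layout (locus name mapped to allele frequencies) and the example values are assumptions for illustration; no small-sample correction is applied.

```python
import math

def nei_standard_distance(pop_x, pop_y) -> float:
    """Nei's (1972) standard genetic distance between two populations.

    Each population is a dict: locus -> {allele: frequency}.
    Only loci typed in both populations are used.
    """
    loci = sorted(pop_x.keys() & pop_y.keys())
    if not loci:
        raise ValueError("the populations share no loci")
    jx = jy = jxy = 0.0
    for locus in loci:
        fx, fy = pop_x[locus], pop_y[locus]
        jx += sum(p * p for p in fx.values())
        jy += sum(p * p for p in fy.values())
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    n = len(loci)
    jx, jy, jxy = jx / n, jy / n, jxy / n
    if jxy == 0.0:
        return math.inf   # no shared alleles at any locus
    return -math.log(jxy / math.sqrt(jx * jy))

# Hypothetical single-locus example.
breed_a = {"LOCUS_1": {120: 1.0}}
breed_b = {"LOCUS_1": {120: 0.5, 124: 0.5}}
print(nei_standard_distance(breed_a, breed_a))   # identical populations
print(nei_standard_distance(breed_a, breed_b))   # 0.5 * ln(2)
```

A matrix of such pairwise distances between breeds is the usual input for building a dendrogram.
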

Molecular data processing

1. GenAlex
2. POPGENE
3. GDA (Genetic Data Analysis)
4. GENEPOP
5. Phylip
6. Microsat
7. TreeView
8. FSTAT
9. BOTTLENECK
10. STRUCTURE

SNPs vs STR

• Each SNP is less informative because it has only two alleles.
• More SNPs need to be genotyped to equal a distinctive DNA profile:
  Computationally: 25 to 45 SNPs equal 13 core STR loci
  Actual lab work: 50 or more SNPs equal 12 STRs
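One way to see why a two-allele SNP carries less information than a multi-allele STR is to compare the highest expected heterozygosity each can reach. With k equally frequent alleles, He = 1 - 1/k. The Python sketch below is only this arithmetic; the 10-allele STR is a hypothetical example and the comparison does not reproduce the SNP-to-STR equivalence figures quoted above.

```python
def max_expected_heterozygosity(n_alleles: int) -> float:
    """He when all alleles are equally frequent: 1 - k * (1/k)^2 = 1 - 1/k."""
    if n_alleles < 1:
        raise ValueError("a locus needs at least one allele")
    return 1.0 - 1.0 / n_alleles

print(max_expected_heterozygosity(2))    # biallelic SNP: at most 0.5
print(max_expected_heterozygosity(10))   # hypothetical 10-allele STR: 0.9
```
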
