Meta-analysis of ADMIXTURE results

Perry, Petros & students

Objective of our analysis

Based on the output of ADMIXTURE, we want to quantify the amount of shared ancestry between two populations.

Beyond that, we would like to answer questions of the form: “How much ancestry is shared between populations X and Y, beyond the ancestry that population X already shares with population Z?”

This led us to a linear algebraic approach for the meta-analysis of ADMIXTURE results.

An overview of our analysis

• We always start with the output of ADMIXTURE for some value of K (the number of ancestral populations).

• We summarize population X using the top eigenvector of its ADMIXTURE coefficients for that particular value of K; more than one eigenvector is used if the variance captured by the top eigenvector is not sufficient (currently, we summarize a population X by using enough eigenvectors to capture at least 90% of its variance).

• We measure the amount of variance of population Y that is captured by the eigenvectors that form the summary of population X; we take this quantity to be the amount of shared variance.

• We use standard linear algebraic metrics (Frobenius norm) to measure the amount of captured variance.

• Standard vector space algebra allows us to address queries of the form: given population Y - Z (population Y minus the contribution of population Z), compute the variance of population X that Y - Z captures.

• The approach is reasonably robust to noise and outliers; bootstrapping can be used to obtain confidence intervals.

• This is a meta-analysis of ADMIXTURE results, loosely analogous to PCAdmix.
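The steps above can be sketched in NumPy. This is only one possible reading of the overview, not the authors' implementation. It assumes that each population is an (individuals × K) matrix of ADMIXTURE coefficients, that the "eigenvectors" are the top right singular vectors of that matrix taken without centering, that captured variance is a ratio of squared Frobenius norms, and that "X - Z" means removing the span of Z's summary from X's summary. The function names, the toy Dirichlet data, and the bootstrap scheme (resampling individuals with replacement) are all hypothetical.

```python
import numpy as np

def summary_basis(Q, min_variance=0.90):
    """Orthonormal K x r basis: the top right singular vectors of Q (the
    eigenvectors of Q^T Q) that capture >= min_variance of ||Q||_F^2.
    Assumption: Q is used as-is, without centering."""
    _, s, vt = np.linalg.svd(Q, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    r = min(int(np.searchsorted(energy, min_variance)) + 1, len(s))
    return vt[:r].T

def captured_fraction(Q, basis):
    """Fraction of ||Q||_F^2 that survives projection onto span(basis)."""
    if basis.shape[1] == 0:
        return 0.0
    num = np.linalg.norm(Q @ basis, "fro") ** 2
    return float(num / np.linalg.norm(Q, "fro") ** 2)

def remove_subspace(basis, other, tol=1e-10):
    """Assumed meaning of 'X - Z': project X's basis onto the orthogonal
    complement of span(other), then re-orthonormalize what is left."""
    residual = basis - other @ (other.T @ basis)
    u, s, _ = np.linalg.svd(residual, full_matrices=False)
    return u[:, s > tol]

def bootstrap_interval(Q_x, Q_y, n_boot=200, seed=0):
    """Percentile 95% interval for the X -> Y score, resampling the
    individuals of both populations with replacement (an assumption)."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_boot):
        xb = Q_x[rng.integers(0, len(Q_x), len(Q_x))]
        yb = Q_y[rng.integers(0, len(Q_y), len(Q_y))]
        scores.append(captured_fraction(yb, summary_basis(xb)))
    return np.percentile(scores, [2.5, 97.5])

if __name__ == "__main__":
    # Toy data only: random rows that sum to 1, standing in for K=5 output.
    rng = np.random.default_rng(0)
    Q_x = rng.dirichlet([8, 1, 1, 1, 1], size=40)
    Q_y = rng.dirichlet([6, 3, 1, 1, 1], size=30)
    Q_z = rng.dirichlet([1, 1, 8, 1, 1], size=30)

    b_x, b_z = summary_basis(Q_x), summary_basis(Q_z)
    print("X -> Y:", captured_fraction(Q_y, b_x))
    print("X - Z -> Y:", captured_fraction(Q_y, remove_subspace(b_x, b_z)))
    print("95% interval for X -> Y:", bootstrap_interval(Q_x, Q_y))
```

Because the rows of an ADMIXTURE coefficient matrix sum to 1, an uncentered decomposition is dominated by the population's mean ancestry profile. Whether to center first is a design choice that the overview leaves open, and it changes the scores.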

Example: an analysis of Crete, Veneto, Tuscany, Italy

ADMIXTURE meta-analysis, K=5.

Tuscan → Crete: 8.9%

Italian → Crete: 4.2%

Veneto → Crete: 5.2%

Veneto - Italian → Crete: 0.1%

Veneto - Tuscan → Crete: 0.9%

Veneto - (Italian + Tuscan) → Crete: 0.1%

X → Y indicates the amount of shared ancestry from population X to Y.

X - Z → Y indicates the amount of shared ancestry from population X to Y, minus the amount of ancestry that is already captured by population Z.

Example: an analysis of Crete, French, Basque, Andalusia

ADMIXTURE meta-analysis, K=5.

Basque → Crete: 0.2%

Andalusia → Crete: 7.0%

French → Crete: 1.4%

Andalusia - French → Crete: 3.1%

Andalusia - Basque → Crete: 6.4%

Andalusia - (French + Basque) → Crete: 2.8%

Example: an analysis of Crete and Middle Eastern/Arab populations, v1

ADMIXTURE meta-analysis, K=5.

Syrians → Crete: 17.5%

Iranians → Crete: 1.8%

Jordanians → Crete: 6.3%

Lebanese → Crete: 6.5%

Saudis → Crete: 0.4%

Syrians - Saudis → Crete: 15.6%

Lebanese - Saudis → Crete: 5.5%

Iranians - Saudis → Crete: 1.5%

Example: an analysis of Crete and Middle Eastern/Arab populations, v2

ADMIXTURE meta-analysis, K=5.

Syrians → Crete: 6.5%

Jordanians → Crete: 1.2%

Lebanese → Crete: 17.4%

Kurds → Crete: 21.0%

Lebanese - (Kurds + Syrians) → Crete: 0.0%

Syrians - (Kurds + Lebanese) → Crete: 0.0%

Example: an analysis of Crete and Middle Eastern/Arab populations, v3

ADMIXTURE meta-analysis, K=5.

Syrians → Crete: 79.7%

Jordanians → Crete: 56.0%

Lebanese → Crete: 72.5%

Kurds → Crete: 93.0%

Palestinian → Crete: 6.6%

Lebanese - (Kurds + Syrians + Palestinian) → Crete: 0.0%

Syrians - (Kurds + Lebanese + Palestinian) → Crete: 0.0%

Remark: the Palestinians have a lot of genetic diversity within themselves; as a result, ADMIXTURE (at K=5) fails to capture the difference between, for example, Cretans and Kurds, and instead captures the genetic diversity among the Palestinians.
