Bayesian inference of biogeographical histories for hundreds of discrete areas
Michael Landis Nick Matzke Brian Moore
John Huelsenbeck
Evolution 06/23/13
Biogeography
“Every species has come into existence coincident both in space and time with a pre-existing closely allied species.”
AR Wallace, 1855
Continental-scale biogeography
Octodon degus Photo by José Cañas
(Mol Phylo Evol 2012)
[Figure: species-level phylogeny (tips from Cuniculus taczanowskii to Abrocoma cinerea) with each species' range scored across nine areas, A through I. Supplemental Figure 1, panels a and b.]
Global-scale biogeography
For 8 areas
For 80 areas
For 800 areas
For 8 zillion areas
13,264 occurrences available (GBIF)
86 occurrences used (Upham & Patterson, 2012)
Why 9 areas?
Transition between two ranges
Ancestral Observed & extant
Foundational work: Ree et al. (Evolution 2005), Ree & Smith (Syst Biol 2008)
Range
Character
time
Transition probability
Q: instantaneous rate matrix
Matrix exponentiation accounts for all intermediate events.
For few areas, no problem (3 areas)
For more areas, Q explodes (3 areas vs. 10 areas)
10 areas: $2^{10} \times 2^{10} = 1024 \times 1024$
Matrix exponentiation too slow for more than ten areas.
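A minimal sketch of why Q explodes, assuming (as on the slide) that a range is a presence/absence vector over the areas, so N areas give 2^N ranges. The function names are illustrative and not part of BayArea.

```python
def num_ranges(n_areas: int) -> int:
    """Number of presence/absence ranges over n_areas discrete areas."""
    return 2 ** n_areas


def q_entries(n_areas: int) -> int:
    """Number of entries in the square rate matrix Q over all ranges."""
    return num_ranges(n_areas) ** 2


# The matrix side doubles with every added area, so its entry count quadruples.
for n in (3, 10, 20):
    print(n, num_ranges(n), q_entries(n))
```

For 10 areas this gives the 1024 × 1024 matrix quoted above; for 20 areas the matrix already has about 10^12 entries.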
Download BayArea: bayarea.googlecode.com
Landis et al. (Syst Biol, in press)
BayArea: Method for more areas
Inspired by Robinson et al. (Mol Biol Evol 2003)
Key concepts:
1. Sample biogeographic histories, $H$
2. Compute likelihood, $L_{\theta,H}$
3. Approximate $P(\theta, H \mid D)$ using Markov chain Monte Carlo (MCMC)
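To make the MCMC step concrete, here is a toy Metropolis sampler. It is a deliberately reduced sketch, not BayArea's sampler: it samples only a single event rate (a stand-in for the parameters) and does not sample histories. The data summary (`n_events`, `total_time`), the flat prior, and the proposal width are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(rate, n_events, total_time):
    """Unnormalized log posterior of an event rate under a flat prior on rate > 0."""
    if rate <= 0.0:
        return -math.inf
    # n_events exponential waiting times summing to total_time
    return n_events * math.log(rate) - rate * total_time


def metropolis(n_events, total_time, n_iter=20000, step=0.05, seed=1):
    """Random-walk Metropolis sampler for the rate."""
    rng = random.Random(seed)
    rate = 0.1
    current = log_posterior(rate, n_events, total_time)
    samples = []
    for _ in range(n_iter):
        proposal = rate + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, n_events, total_time)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            rate, current = proposal, candidate
        samples.append(rate)
    return samples


samples = metropolis(n_events=20, total_time=100.0)
kept = samples[2000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(posterior_mean)
```

Under these assumptions the exact posterior is a gamma distribution with mean (n_events + 1) / total_time, which gives a simple check on the sampler.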
1. Sample biogeographic histories, $H$
Nielsen (Syst Biol 2002)
2. Compute likelihood, $L_{\theta,H}$
Range evolution events from the current range:
sum of rates leaving: $r = \sum_j r_j$
prob any event at time $t$: $r e^{-rt}$
prob next event is $i$: $r_i / r$
$L_{\theta,H}$ = product of event types & times over tree
3. Approximate $P(\theta, H \mid D)$ using MCMC
[Figure: $P(\theta, H \mid D)$, with regions of high and low $L_{\theta,H}$ marked.]
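The likelihood of a sampled history can be sketched for a single branch as follows. This is an illustration under stated assumptions, not BayArea's code: `rates_from` is a hypothetical callback that returns the rates out of a range, and `toy_rates` is an invented two-parameter independence model that forbids the empty range. Each event contributes r e^{-r Δt} × r_i / r = r_i e^{-r Δt}, and the stretch after the last event contributes e^{-r Δt}.

```python
import math


def history_log_likelihood(start, events, branch_length, rates_from):
    """Log-probability of one sampled history along a single branch.

    start         -- range at the top of the branch (tuple of 0/1)
    events        -- list of (time, new_range), times increasing within the branch
    branch_length -- total duration of the branch
    rates_from    -- hypothetical callback: range -> {reachable range: rate}
    """
    log_l = 0.0
    state, t_prev = start, 0.0
    for t, new_state in events:
        rates = rates_from(state)
        r = sum(rates.values())
        # log of r*exp(-r*dt) * (r_i / r) simplifies to log(r_i) - r*dt
        log_l += math.log(rates[new_state]) - r * (t - t_prev)
        state, t_prev = new_state, t
    # No further event between the last event and the end of the branch.
    r = sum(rates_from(state).values())
    log_l -= r * (branch_length - t_prev)
    return log_l


def toy_rates(state, gain=0.02, loss=0.1):
    """Illustrative independence model: flip one area at a time, never go empty."""
    rates = {}
    for a, present in enumerate(state):
        new = state[:a] + (1 - present,) + state[a + 1:]
        if sum(new) == 0:
            continue  # transitions into the empty range are forbidden
        rates[new] = loss if present else gain
    return rates


result = history_log_likelihood((1, 0), [(0.5, (1, 1))], 1.0, toy_rates)
print(result)
```

Summing this quantity over all branches gives the log of the product over the tree.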
Can we infer distance effects?
Distance-dependent dispersal model
Redistributes the rate of area gain…
Simulation: 600 areas, 50 replicates, 8 distances
Distance power β: 0, ¼, ½, 1, 2, 3, 4, 6
β = 0 (dispersal anywhere): collapses to the “independence” model
Larger β: dispersal to nearby areas
BayArea recovers true parameters
[Figure: three panels, titled Rate of area gain, Rate of area loss, and Distance effects. Each plots a mean posterior against the distance power β under which the data were simulated (0, ¼, ½, 1, 2, 3, 4, 6): rate of gain λ1 (axis 0.0035 to 0.0060), rate of loss λ0 (axis 0.035 to 0.060), and distance power β (axis 0 to 6).]
Bayes factors identify distance effects
[Figure: percentage of simulations (0% to 100%) supporting the distance model MD, for each simulated distance power β (0, ¼, ½, 1, 2, 3, 4, 6). Legend, BF_D0 support for MD: Favors M0, Insubstantial, Substantial, Strong, Very strong, Decisive.]
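The support categories in the legend can be produced by binning a Bayes factor. The slides do not state the cutoffs, so the sketch below assumes a Jeffreys-style scale on log10(BF) with half-unit steps; the thresholds, the function name, and the inputs (log marginal likelihoods of MD and M0) are assumptions for illustration only.

```python
import math

# Assumed lower bounds on log10(BF_D0) for each label; not given on the slides.
SCALE = [
    (2.0, "Decisive"),
    (1.5, "Very strong"),
    (1.0, "Strong"),
    (0.5, "Substantial"),
    (0.0, "Insubstantial"),
]


def classify_support(log_ml_d, log_ml_0):
    """Label support for M_D over M_0 from natural-log marginal likelihoods."""
    log10_bf = (log_ml_d - log_ml_0) / math.log(10.0)
    for lower_bound, label in SCALE:
        if log10_bf >= lower_bound:
            return label
    return "Favors M0"  # negative log Bayes factor


print(classify_support(-1000.0, -1005.0))
```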
Malesian Rhododendron Vireya
Re-analysis of Webb & Ree (2012) work: 65 species, 20 areas
Wallace’s Line: known dispersal barrier. Vireya crossing?
Data from Brown et al. (J Biogeogr 2012) and Webb & Ree (Chapter 8 in Biotic Evolution and Environmental Change in Southeast Asia, 2012)
Distance matters for Vireya dispersal
[Figure: prior and posterior densities in three panels: rate of area loss, λ0 (axis 0.10 to 0.25); rate of area gain, λ1 (axis 0.005 to 0.025); distance power, β (axis -4 to 4).]
Posterior of ancestral ranges
[Figure, panel B: phylogeny with node maps showing the posterior probability of presence per area (0.0 to 1.0) and branches shaded by the percentage of the inferred range east of Wallace’s Line (0.0 to 1.0; W = west, E = east). Map insets are labeled West and East.]
Wallace’s Line: 3+ crossings
Wallace’s Line & Lydekker’s Line: 1 crossing
Phylowood: biogeographic animations
Future directions
Rate-modifiers for other traits/features
Incorporating paleo-etc.-ical data
Occupancy models to handle “false absences”
Speciation models (allopatry vs sympatry)
Adding to RevBayes (easy to develop models)
Summary
Allows hundreds of areas for analysis
Joint posterior of parameters and ancestral ranges
Simple distance-dependent dispersal model
Efficient model testing framework
Open-source software available
Thanks! Questions?
Contact
[email protected] twitter.com/landismj
BayArea: biogeography for many areas
bayarea.googlecode.com
Nick Matzke, Brian Moore, John Huelsenbeck
Phylowood: biogeographic animations
mlandis.github.com/phylowood
Trevor Bedford
Helpful folks
Bastien Boussau, Tracy Heath, Josh Schraiber, Sebastian Höhna
Extra slides
Malesian paleogeography
[Figure 2 of Lohman et al. (Annu. Rev. Ecol. Evol. Syst. 2011, 42:205-226): six Cenozoic reconstructions of land and sea in the Indo-Australian Archipelago. Panels: (a) 60 Mya, Paleocene; (b) 40 Mya, Late Eocene; (c) 30 Mya, Middle Oligocene; (d) 20 Mya, Early Miocene; (e) 10 Mya, Late Miocene; (f) 5 Mya, Early Pliocene. Legend: land, highlands, volcanoes, carbonate platforms, lakes, shallow sea, deep sea, trenches.]
Lohman et al. (2011)
We assume constant geography, but…
Vireya results
Assume Vireya root age is 55 Mya
Ancestral range posterior
Joint crossing of Wallace’s Line (WL) and Lydekker’s Line (LL) once, ~40 Mya
All other WL crossings < 15 Mya
Plausible biogeographical scenario
Single long-distance dispersal event around 40 Mya
As the Sunda and Sahul Shelves converge, repeated short dispersals
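Dispersal-extinction model
$$
R^{(a)}_{Y_i,Y_j} =
\begin{cases}
\lambda_0 & \text{if } Y_{j,a} = 0 \\
\lambda_1 & \text{if } Y_{j,a} = 1 \\
0 & \text{if } Y_i \text{ and } Y_j \text{ differ at more than one area} \\
0 & \text{if } Y_j = (0, 0, \ldots, 0)
\end{cases}
$$
iid, Jukes-Cantor, forbids extinction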
Rate-modified dispersal model
only a single area can be gained or lost. In other words, each row of Q contains up to N positive,
non-zero entries, which correspond to the rates at which any one of the N areas switches between
absent and present (i.e., the N 0 → 1 and 1 → 0 positive entries of the row). Additionally, each
row contains a single, negative diagonal entry, which accounts for the time during which no change
in geographic range occurs, defined as $Q_{ii} = -\sum_{j \neq i} Q_{ij}$, and ensures that each row of Q sums
to zero. The remaining entries in Q have a value of zero, as they entail an instantaneous change in
geographic distribution involving two or more areas.
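The structure of Q described above can be sketched for a small number of areas. This is an illustration, not BayArea's implementation (BayArea avoids building Q at all). It assumes constant per-area gain and loss rates and, as a simplification of this sketch, leaves the empty range out of the state space because transitions into it have rate zero.

```python
from itertools import product


def build_q(n_areas, gain, loss):
    """Build the one-change-at-a-time rate matrix over all non-empty ranges."""
    ranges = [r for r in product((0, 1), repeat=n_areas) if sum(r) > 0]
    index = {r: i for i, r in enumerate(ranges)}
    size = len(ranges)
    q = [[0.0] * size for _ in range(size)]
    for r in ranges:
        i = index[r]
        for a in range(n_areas):
            new = r[:a] + (1 - r[a],) + r[a + 1:]
            if new not in index:
                continue  # losing the last occupied area is forbidden
            q[i][index[new]] = loss if r[a] == 1 else gain
        # Diagonal entry makes the row sum to zero.
        q[i][i] = -sum(q[i])
    return ranges, q


ranges, q = build_q(n_areas=3, gain=0.02, loss=0.1)
print(len(ranges), "ranges")
```

Every other entry stays at zero because it would require two or more areas to change at the same instant.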
We define a distance-dependent dispersal model, MD, where the rate of gaining a particular area
(0 → 1) depends on the relative proximity of available areas to those currently occupied by a lineage;
that is, the rate of colonizing a nearby area just outside the perimeter of the current geographic
range should be greater than that of colonizing a relatively remote area. The precise nature of the
relationship between geographic distance and dispersal probability might be specified in numerous
ways (see, e.g., Wallace 1887; MacArthur and Wilson 1967; Hanski 1998). Our distance-dependent
model specifies a simple relationship in which the probability of dispersal between two areas is
inversely related to the geographic distance between them.
Let $R^{(a)}_{Y_i,Y_j}$ be the rate of change from the geographic range $Y_i$ to the geographic range $Y_j$, where
$Y_i$ and $Y_j$ differ only at the single area index $a$ (again, reflecting the fact that this is a
one-change-at-a-time model). Also, let $\lambda_0 \in \theta$ and $\lambda_1 \in \theta$ be the respective rates at which an individual area
is lost or gained within a geographic range, and $\eta(Y_i, Y_j, a, \beta)$ be a dispersal-rate modifier that
accounts for correlative distance effects. We define the instantaneous dispersal rate as
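$$
R^{(a)}_{Y_i,Y_j} =
\begin{cases}
\lambda_0 & \text{if } Y_{j,a} = 0 \\
\lambda_1 \, \eta(Y_i, Y_j, a, \beta) & \text{if } Y_{j,a} = 1 \\
0 & \text{if } Y_i \text{ and } Y_j \text{ differ at more than one area} \\
0 & \text{if } Y_j = (0, 0, \ldots, 0)
\end{cases}
\tag{1}
$$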
and the distance-dependent dispersal rate modifier as
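$$
\eta(Y_i, Y_j, a, \beta) =
\sum_{n=1}^{N} 1_{\{Y_{i,n}=1\}} \, d(G_n, G_a)^{-\beta}
\times
\frac{\sum_{m=1}^{N} 1_{\{Y_{j,m}=0\}}}
{\sum_{m=1}^{N} 1_{\{Y_{j,m}=0\}} \left( \sum_{n=1}^{N} 1_{\{Y_{i,n}=1\}} \, d(G_n, G_m)^{-\beta} \right)}
\tag{2}
$$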
Per-area rate of gain depends on current biogeographical range.
Compute likelihood, $L_{\theta,H}$
[Figure: example ranges over areas 1 to 3: 011, 001, 101, 101.]
Distance-dependent dispersal model
Figure 2: Cartoon of the computation of the distance-dependent dispersal rate-modifier, $\eta(\cdot)$. Here,
we are interested in computing the rate of $Y_i = (1, 1, 0, 0)$ transitioning to $Y_j = (1, 1, 0, 1)$. The
first term computes the sum of inverse distances raised to the power $\beta$ between the area of interest
(i.e., 4) and all currently occupied areas (i.e., areas 1 and 2). The second term then normalizes this
quantity by dividing by the sum of inverse distances raised to the power $\beta$ between all occupied-
unoccupied area-pairs (i.e., the denominator), then multiplying by number of currently unoccupied
areas (i.e., 2, the numerator).
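[Figure: 2 × 2 grid of areas 1 to 4, shown three times; areas 1 and 2 are occupied (1) and areas 3 and 4 are unoccupied (0).]
$$
\eta\bigl(Y_i = (1,1,0,0) \to Y_j = (1,1,0,1),\ a = 4,\ \beta\bigr) =
\underbrace{\bigl[ d(G_1, G_4)^{-\beta} + d(G_2, G_4)^{-\beta} \bigr]}_{\text{rate-modifier}}
\times
\underbrace{\frac{2}{\bigl[ d(G_1, G_3)^{-\beta} + d(G_2, G_3)^{-\beta} \bigr] + \bigl[ d(G_1, G_4)^{-\beta} + d(G_2, G_4)^{-\beta} \bigr]}}_{\text{normalization}}
$$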
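The worked example can be checked numerically with a short sketch. It follows the Figure 2 walk-through, in which unoccupied areas are counted in the current range (2 of them for (1, 1, 0, 0)); that reading, the function name `eta`, and the example coordinates are assumptions of this sketch rather than BayArea's code.

```python
def eta(current, area, beta, dist):
    """Distance-dependent rate modifier for gaining `area` from range `current`.

    current -- tuple of 0/1 presence values, one per area
    area    -- index of the unoccupied area being gained
    beta    -- distance power
    dist    -- square matrix of pairwise distances between areas
    """
    occupied = [n for n, y in enumerate(current) if y == 1]
    unoccupied = [m for m, y in enumerate(current) if y == 0]

    def pull(m):
        # Sum of inverse distances (to the power beta) from occupied areas to m.
        return sum(dist[n][m] ** -beta for n in occupied)

    # Rate-modifier term times the normalization term.
    return pull(area) * len(unoccupied) / sum(pull(m) for m in unoccupied)


# Hypothetical geography: four areas on a line at these positions.
coords = [0.0, 1.0, 2.0, 5.0]
dist = [[abs(a - b) for b in coords] for a in coords]

current = (1, 1, 0, 0)
near = eta(current, 2, beta=2.0, dist=dist)
far = eta(current, 3, beta=2.0, dist=dist)
print(near, far)
```

With this normalization the modifiers average to 1 over the unoccupied areas, so the distance effect redistributes the rate of gain rather than changing its total, and β = 0 gives a modifier of 1 for every area.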
BayArea recovers rate of area gain
[Figure: mean posterior of the rate of gain, λ1 (axis 0.0035 to 0.0060), against the distance power β under which the data were simulated (0, 0.25, 0.5, 1, 2, 3, 4, 6). Labels: posterior rate of area gain, true rate, true distance effects.]
BayArea recovers rate of area loss
[Figure: mean posterior of the rate of loss, λ0 (axis 0.035 to 0.060), against the simulated distance power β (0, 0.25, 0.5, 1, 2, 3, 4, 6). Labels: posterior rate of area loss, true rate, true distance effects.]
BayArea recovers distance effects
[Figure: mean posterior of the distance power, β (axis 0 to 6), against the simulated distance power β (0, 0.25, 0.5, 1, 2, 3, 4, 6). Labels: posterior of distance effects, true value, true distance effects.]