Analysing the link between traits & invasive spread in German flora: accounting for residence time

Joint work between

Eva Küster, Ingolf Kühn ~ UFZ

Adam Butler, Stijn Bierman, Glenn Marion ~ BioSS

Athens ALARM meeting, January 2007

Introduction

• Direct data on the arrival, establishment & spread of invasive species are typically not available at the national or pan-European levels

• Indirect data about the traits & current spatial distribution of species that invaded in the past can be used to identify correlative relationships between traits and invasive success, accounting for phylogeny

• Data on traits are often missing or ambiguous, however, creating serious problems for the analysis – we look at how to address these using Bayesian methods

• We analyse data on German vascular plants
  • Biolflor (www.ufz.de/biolflor): database with information on traits & phylogeny of 3660 species
  • Florkart (www.floraweb.de): database with information on presence/absence of 4000+ species for 2995 grid cells within Germany

• We look at neophyte species (arrivals since 1490), excluding ephemerophytes: there are 388 such species

• We use the # of grid cells occupied as a measure of invasive success

Data

Niche breadth in Germany: # hemerobic levels; Urbanity; # of habitat types; # of vegetation formations; # phytosociological classes

Genetics: Ploidy; DNA content

Morphology: Life form; Growth form; Life span; Generative reproductive cycles

Propagation & dispersal: Types of storage organs; Existence of storage organs; Types of shoot metamorphoses; Types of root metamorphoses

Flowering phenology: Beginning of flowering season; Length of flowering season; End of flowering season

Floral & reproductive biology: Strategy types of reproduction; Mating strategy; Pollen vector; Flower colour; Floral UV pattern; Floral UV reflection; Blossom type

Diaspores & germinules: Types of diaspores; Weights of diaspores; Weights of germinules

Native global distribution: Floristic zones of native area; # floristic zones in native area; Continent of native area; # continents in native area; Native in old or new world?; Oceanity of native area; Amplitude of oceanity

Leaf traits: Leaf persistence; Leaf anatomy; Leaf form

Invasive history: Mode of introduction; Residence time

Life strategy: Ecological strategy; Ruderal life strategy

Current analysis by UFZ

Küster, Kühn and Klotz (in prep.)

• Regress log(# grid cells occupied) onto each of the ~40 individual traits in turn, in the presence of phylogenetic variables

• Retain only traits that are significant at the 95% level, exclude non-predictive traits, & then use cluster analysis to further reduce the set of traits

• Use AIC to select the best model from within this set of traits, including interactions

• At all stages, use only those species that have complete data for all traits currently in the model
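The single-trait screening step can be sketched as follows. This is a minimal illustration under stated assumptions, not the UFZ implementation: it fits a simple least-squares regression of log occupancy on one numeric trait at a time, leaves out the phylogenetic covariates, and uses a normal approximation (|t| > 1.96) in place of an exact test. The function names and the toy data are hypothetical.

```python
import math


def simple_ols(x, y):
    """Least-squares fit of y = a + b*x; returns (slope, standard error of slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se


def screen_traits(cells_occupied, traits, z_crit=1.96):
    """Names of traits whose slope on log(cells occupied) has |t| > z_crit.

    Species with a missing value (None) are dropped for that trait only.
    """
    kept = []
    for name, values in traits.items():
        pairs = [(v, math.log(c)) for v, c in zip(values, cells_occupied)
                 if v is not None]
        x = [p[0] for p in pairs]
        y = [p[1] for p in pairs]
        slope, se = simple_ols(x, y)
        if se > 0 and abs(slope / se) > z_crit:
            kept.append(name)
    return kept


# Toy data, invented purely for illustration.
example_cells = [12, 18, 45, 70, 170, 300, 700, 1200]
example_traits = {
    "flowering_length": [1, 2, 3, None, 5, 6, 7, 8],
    "noise": [3, 1, 4, 1, 5, 9, 2, 6],
}
print(screen_traits(example_cells, example_traits))
```

Dropping species trait by trait, as here, keeps more data at the screening stage than the complete-case rule in the last bullet, which is one reason the two can disagree.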

Phylogenetic correction

Küster, Kühn and Klotz (in prep.)

• Compute the patristic distance matrix based on the phylogenetic codes given in Biolflor

• For the current set of species:
  • apply a principal coordinate analysis to the relevant part of the distance matrix
  • retain only axes associated with positive eigenvalues
  • then retain the axes that account for the first 80% of variation
  • then regress log(# grid cells occupied) onto the remaining axes and retain only those that are significant at the 95% level

• The phylogenetic variables need to be recomputed whenever the set of species is changed
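For reference, the principal coordinate step can be written in the standard textbook form below. The notation ($D$, $J$, $B$, $\lambda_k$, $v_k$, $u_k$) is introduced here and does not come from the slides. Measuring the "first 80% of variation" relative to the sum of the positive eigenvalues is an assumption.

```latex
\[
B = -\tfrac{1}{2}\, J D^{(2)} J, \qquad
J = I_n - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top}, \qquad
D^{(2)} = \bigl(d_{ij}^{2}\bigr)
\]
\[
B = \sum_{k} \lambda_k v_k v_k^{\top}, \qquad
u_k = \sqrt{\lambda_k}\, v_k \quad \text{for } \lambda_k > 0
\]
\[
\text{proportion of variation on axis } k
  = \frac{\lambda_k}{\sum_{j:\, \lambda_j > 0} \lambda_j}
\]
```

Here $d_{ij}$ is the patristic distance between species $i$ and $j$ among the $n$ species currently in the analysis, and the vectors $u_k$ are the candidate phylogenetic variables. Because $J$ and the eigenvectors depend on which species are included, the axes change whenever the species set changes.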

Missing data

• A large number of species are currently excluded from the final analysis as data are missing on some of their traits

• This is inefficient, & could potentially lead to bias if the data are missing not at random

• The missing data arise from different sources:
  • there being no record in the Biolflor database
  • the qualifier in Biolflor suggesting that data quality is poor
  • multiple states being recorded for a particular trait
  • a very rare state being recorded

Residence times

• Residence time is a particularly important variable because
  • it has good explanatory power to describe occupancy
  • it partly accounts for the dynamic nature of invasive processes
  • it allows us to make time-specific predictions about occupancy

• However, data on German residence times are only available for 171 species, & for 35 of these only to the nearest century

• Some auxiliary data are available for neighbouring countries

• How can we properly include residence time in the analysis, given the large proportion of missing data?

Species | Region | Time
Amaranthus deflexus L. | Germany | 1889
Aesculus hippocastanum L. | Germany | 16th century
Acer negundo L. | Czech Republic; Germany | 18th century
Oenothera depressa Greene | Germany | Early 19th century
Oxalis fontana Bunge | Central Europe; Germany | 17th century ?
Epilobium ciliatum Raf. | Central Europe; Germany | 1871 / since 1971
Nepeta grandiflora M. Bieb. | Germany | ca. 1900
Agrostis scabra Willd. | Central Europe; Germany |
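Records such as "1889" and "16th century" carry different amounts of information, and one way to make that explicit is to convert each record into lower and upper bounds on the arrival year. The sketch below is illustrative only. The reference year of 2007, the convention that the 16th century runs from 1501 to 1600, and the function names are all assumptions. Records with qualifiers ("ca. 1900", "Early 19th century", "17th century ?") are deliberately left unparsed rather than guessed at.

```python
import re

REFERENCE_YEAR = 2007  # assumption: year of the talk, used only for illustration


def arrival_bounds(record):
    """Return (earliest, latest) arrival year implied by a record, or None."""
    text = record.strip().lower()
    match = re.fullmatch(r"(\d{1,2})th century", text)
    if match:
        century = int(match.group(1))
        return ((century - 1) * 100 + 1, century * 100)
    match = re.fullmatch(r"(\d{4})", text)
    if match:
        year = int(match.group(1))
        return (year, year)
    return None  # qualified or ambiguous records are treated as missing


def residence_bounds(record, reference_year=REFERENCE_YEAR):
    """Return (shortest, longest) residence time in years, or None."""
    bounds = arrival_bounds(record)
    if bounds is None:
        return None
    earliest, latest = bounds
    return (reference_year - latest, reference_year - earliest)


for rec in ["1889", "16th century", "ca. 1900"]:
    print(rec, residence_bounds(rec))
```

An exact year gives a zero-width interval, a century gives a 100-year interval, and an unparsed record is handled like any other missing residence time.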

Work at BioSS

• The aims of our research on this at BioSS:
  • to explore how sensitive the results of inferences are to the assumptions that we make about missing data
  • to analyse the data in such a way that species with missing data for some traits do not need to be excluded
  • to relate the outputs from the analysis to invasive risk

• We work with the Biolflor-Florkart data, and focus upon missing data for residence times; however, the methodological ideas are widely applicable

Application to toolkit

• Application to the prediction of invasive risk
  • e.g. use traits & phylogeny to infer the number of cells that a recently arrived species is likely to occupy after N years of residence

• This number is uncertain, so it will be a probability distribution rather than a single number
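A prediction of this kind can be sketched by simulation. The example assumes a log-normal regression of occupancy on a trait value x, a phylogenetic variable z and residence time, with coefficients called alpha, beta, gamma, delta and residual standard deviation sigma. These names, the posterior draws and the covariate values are all hypothetical and invented for illustration. The sketch also ignores the upper limit set by the number of grid cells.

```python
import math
import random
import statistics


def predictive_cells(draws, x, z, n_years, rng):
    """Simulate one occupancy value (number of cells) per posterior draw."""
    sims = []
    for alpha, beta, gamma, delta, sigma in draws:
        mean_log = alpha + beta * x + gamma * z + delta * n_years
        sims.append(math.exp(rng.gauss(mean_log, sigma)))
    return sims


rng = random.Random(1)

# Hypothetical posterior draws of (alpha, beta, gamma, delta, sigma).
draws = [
    (rng.gauss(3.0, 0.2), rng.gauss(0.6, 0.2), rng.gauss(0.1, 0.1),
     rng.gauss(0.01, 0.002), abs(rng.gauss(1.0, 0.1)))
    for _ in range(2000)
]

sims = sorted(predictive_cells(draws, x=1.0, z=0.0, n_years=50, rng=rng))
lower = sims[50]      # approximate 2.5% quantile of 2000 simulations
upper = sims[1949]    # approximate 97.5% quantile
median = statistics.median(sims)
print(round(lower), round(median), round(upper))
```

The interval reflects both parameter uncertainty (the spread of the draws) and species-to-species variation (sigma), which is why the output is a distribution rather than a single number.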

Bayesian methods

• An alternative approach to statistical modelling and inference, in which data are regarded as fixed and parameters are regarded as random

• Increasingly widely used: due to improvements in computational power it is now often possible to fit more advanced models using Bayesian inference than using classical statistical methods

• Particularly suitable for problems that involve missing data

• Implemented using free software called WinBUGS: extremely powerful but not particularly user-friendly…

Bayesian modelling

Notation: for species $i$:

$y_i$ = # of grid cells occupied

$r_i$ = residence time

$x_i$ = other trait data

$z_i$ = phylogenetic variables

Basic model

$\log y_i \sim N(\alpha + \beta x_i + \gamma z_i + \delta r_i,\ \sigma^2)$

…just the same as a GLM

Prior distributions

We use uninformative priors

$\alpha, \beta, \gamma, \delta \sim N(0, 1000)$

$\sigma^2 \sim \mathrm{Gamma}(1/1000,\ 1/1000)$

• Recast the UFZ methodology in a Bayesian context, and implement this in WinBUGS

• Use this to explore potential refinements or extensions to the current analysis

• Assess sensitivity to the assumptions about missing data, phylogenetic dependence and distribution of the response variable (log-normal or Binomial)

• Implementation is in WinBUGS

• Develop ways of dealing more efficiently with missing data

• Bayesian

LPJ code: Ben Smith, Stephen Sitch, Sybil Schapoff

CRU data: David Viner

GCM data: PCMDI

Statistical methods: Jonathan Rougier, Chris Glasbey

Uncertainty analysis: Bjoern Reineking, Stijn Bierman

MCMC details:

Burn-in = 5000, Sample = 2000

Thinning ratio = 1:50

Imputation

• When data on residence times are missing, we can treat them as random variables

• We can use data on the other traits, phylogeny & number of grid cells occupied to infer the distribution of the residence time for a particular species $i$

$\log r_i \sim N(\exp\{a + b x_i + c z_i + d y_i\},\ s^2)$

• Use of the cut function ensures this does not bias inferences about the main-model parameters $\alpha$, $\beta$, $\gamma$, $\delta$ and $\sigma^2$


Results: Ploidy

Polyploid vs diploid

Estimate (SE) for trait effect | Classical | Bayesian
Trait | .580 (.226) | .587 (.225)
Trait + Phylogeny | .636 (.220) | .656 (.211)
Trait + Phylogeny + Residence | .790 (.347) | .630 (.216) [cut]; .761 (.199) [full]

Pink result based on 124 species

Other results based on 345 species

42 species excluded

Main model: P(parameter > 0) | Imputation model: P(parameter > 0)
$\beta$: > .99 | b: .14
$\gamma$: 1, .94 | c: 1, .84
$\delta$: > .99 | d: .99

Imputed values

Predictions

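Results: Duration of flowering

Estimate (SE) for trait effect | Classical | Bayesian
Trait | .362 (.084) | .358 (.080)
Trait + Phylogeny | .329 (.083) | .326 (.081)
Trait + Phylogeny + Residence | | .204 (.076) [full]

8 species excluded

Main model: P(parameter > 0) | Imputation model: P(parameter > 0)
$\beta$: > .99 | b: .97
$\gamma$: .99 | c: > .99
$\delta$: > .99 | d: > .99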

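Results: End of flowering

Estimate (SE) for trait effect | Classical | Bayesian
Trait | .207 (.060) | .206 (.058)
Trait + Phylogeny | .167 (.060) | .166 (.059)
Trait + Phylogeny + Residence | | .227 (.060) [full]

8 species excluded

Main model: P(parameter > 0) | Imputation model: P(parameter > 0)
$\beta$: .96 | b: .17
$\gamma$: .98 | c: > .99
$\delta$: > .99 | d: > .99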

Results: Pollen vector

Estimate (SE) for trait effect | Wind vs Self: Classical | Wind vs Self: Bayesian | Insect vs Self: Classical | Insect vs Self: Bayesian
Trait | -1.16 (.38) | -1.16 (.37) | -0.71 (.32) | -0.72 (.31)
Trait + Phylogeny | -0.79 (.38) | -0.81 (.36) | -0.72 (.32) | -0.72 (.31)
Trait + Phylogeny + Residence | -1.22 (.51) | -0.57 (.37) | -0.74 (.43) | -0.39 (.33)
 | | -0.51 (.32) | | -0.56 (.27)

Main model: P(parameter > 0) | Imputation model: P(parameter > 0)
$\beta$: .06, .13 | b: .06, .82
$\gamma$: < .01, < .01 | c: < .01, .08
$\delta$: > .99 | d: .99

Pink result: 108 species

Other results: 329 species

58 species excluded

Results: Shoot metamorphoses

T = Trait, P = Phylogeny, R = Residence

a vs no | Classical | Bayesian
T | 0.64 (.34) | 0.64 (.34)
T+P | 0.68 (.34) | 0.70 (.34)
T+P+R | 0.61 (.62) | 0.82 (.35)

rh vs no | Classical | Bayesian
T | -1.06 (.35) | -1.05 (.34)
T+P | -0.79 (.37) | -0.82 (.37)
T+P+R | 0.26 (.63) | -0.70 (.35)

p vs no | Classical | Bayesian
T | 0.09 (.34) | 0.10 (.32)
T+P | 0.05 (.34) | 0.08 (.37)
T+P+R | -0.02 (.65) | 0.23 (.33)

z vs no | Classical | Bayesian
T | -1.12 (.65) | -1.04 (.65)
T+P | -0.24 (.75) | -0.26 (.75)
T+P+R | ? | -0.06 (.69)

Significance of trait effect in Bayesian model: posterior probability that $\beta$ > 0

Trait | Contrast | Trait only | Trait + phylogeny | CUT: T + P + residence
Ploidy | polyploid vs diploid | > .99 | > .99 | > .99
Length of flowering season | | > .99 | > .99 | > .99
End of flowering season | | > .99 | > .99 | .94
Shoot | a vs none | .97 | .98 | .99
Shoot | rh vs none | < .01 | .01 | .02
Shoot | p vs none | .62 | .59 | .75
Shoot | z vs none | .05 | .36 | .47
Pollen vector | wind vs self | < .01 | .01 | .06
Pollen vector | insect vs self | .01 | .01 | .12

(Note: posterior probability that $\delta$ > 0 is always > 0.99)

Further work 1:

Data Not Missing at Random

• Our model assumes that the data on residence times are missing at random, as does the approach of excluding missing data

• We can also consider possible mechanisms by which the missing data might be related to the variables of interest

Let $o_i$ = 1 if residence time observed for species $i$, 0 otherwise

• We could assume that

$o_i \sim \mathrm{Binomial}(1,\ \mathrm{logit}^{-1}\{A + B x_i + C z_i + D y_i + E r_i\})$

• The parameter E cannot be estimated, but we can assess sensitivity to the value of it; we assume here that E is negative
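The effect of a negative E can be seen by evaluating the observation probability directly. The sketch below implements only the inverse-logit expression in the selection model above. The parameter values and the scale on which residence time r is measured are assumptions chosen for illustration.

```python
import math


def prob_observed(x, z, y, r, A, B, C, D, E):
    """P(o_i = 1) under the selection model, computed in a numerically stable way."""
    eta = A + B * x + C * z + D * y + E * r
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical values: only E is non-zero, so only residence time matters.
p_short = prob_observed(x=0, z=0, y=0, r=0.5, A=0, B=0, C=0, D=0, E=-1)
p_long = prob_observed(x=0, z=0, y=0, r=3.0, A=0, B=0, C=0, D=0, E=-1)
print(p_short, p_long)
```

With E < 0, species with long residence times are less likely to have a recorded arrival date. Species with missing dates therefore tend to have resided for longer than a missing-at-random model would suggest, and the imputed residence times should increase as E becomes more negative.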

Model | Trait effect: estimate (SE) | Mean (Q2.5%, Q97.5%) imputed residence
Trait only | .206 (.058) | -
+ Phylogeny | .166 (.059) | -
+ Residence, MAR, CUT | .096 (.061) | 114 (34, 355)
+ Residence, MAR, full | .227 (.060) | 104 (27, 351)
+ Residence, NMAR, CUT, E = -1 | .094 (.062) | 145 (44, 454)
+ Residence, NMAR, CUT, E = -2 | .096 (.064) | 191 (55, 619)
+ Residence, NMAR, CUT, E = -3 | .090 (.058) | 315 (73, 916)

Further work 2:

Multiple traits

• Relatively low proportions of missing data for the other key traits: we can just exclude these when we look at traits individually, but this is more problematic when we look at the effects of multiple traits

• Most “missing data” for the other key traits arise because rare or duplicate trait states are recorded in Biolflor

• We would like to incorporate this information directly into the analysis, rather than attempting to impute the missing values

• We can deal with duplicate states either:
  • by assuming that the parameter for species that have both states is the average of the parameters for the two states; or
  • by including a separate parameter for species that have duplicate traits
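The two options correspond to two ways of coding the design matrix. The sketch below is a hypothetical illustration (the function names and state labels are assumptions): under averaging, a species recorded with k states gets weight 1/k on each, with the baseline state contributing a coefficient of zero; under the alternative, each duplicate combination gets its own indicator column.

```python
def average_row(states, levels):
    """Weight 1/k on each of the k recorded states; the baseline has no column."""
    k = len(states)
    return [1.0 / k if level in states else 0.0 for level in levels]


def separate_row(states, levels, combos):
    """0/1 indicators: single-state columns first, then one per duplicate combination."""
    single = [1.0 if states == {level} else 0.0 for level in levels]
    multi = [1.0 if states == combo else 0.0 for combo in combos]
    return single + multi


def effect(row, coefs):
    """Linear-predictor contribution of a design row."""
    return sum(w * c for w, c in zip(row, coefs))


# Ploidy with diploid as baseline: a species recorded as both diploid and polyploid.
row = average_row({"diploid", "polyploid"}, levels=["polyploid"])
print(row, effect(row, [0.592]))

# Pollen vector with selfing as baseline: a species recorded as wind + self.
combos = [{"insect", "self"}, {"wind", "self"}, {"wind", "insect"}]
print(separate_row({"wind", "self"}, ["wind", "insect"], combos))
```

Under averaging, the implied "both vs diploid" effect is exactly half the polyploid coefficient. This appears consistent with the averaged column of the comparison table for this slide, where .296 is half of .592, though the slides do not state the coding explicitly.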

Trait | # treated as missing in current analysis | # with no record at all
Ploidy | 42 | 13
Length of flowering season | 8 | 8
End of flowering season | 8 | 8
Pollen vector | 58 | 37
Shoot metamorphoses | 59 | 1
Any of the above five traits | 134 | 54

Species | Pollen vector | Qualifier
Acer negundo L. | Wind | Always
Adonis annua L. | Selfing; Insects | Unknown
Alcea rosea L. | Selfing; Insects | At failure of outcrossing; The rule
Artemisia dracunculus L. | Wind | The rule
Diplotaxis muralis (L.) DC. | Selfing; Selfing; Insects | The rule; At failure of outcrossing; The rule
Elodea canadensis Michx. | Water | The rule
Epilobium ciliatum Raf. | Selfing; Cleistogamy | The rule

Missing data in current analysis

Method to deal with duplicates | Exclude | Average of parameters | Separate parameter
Ploidy: Polyploid vs Diploid | .636 (.220) | .592 (.226) | .641 (.225)
Ploidy: Both vs Diploid | - | .296 (.113) | .747 (.396)
Pollen vector: Wind vs Self | -.795 (.376) | -.508 (.376) | -.653 (.384)
Pollen vector: Insect vs Self | -.716 (.315) | -.683 (.310) | -.748 (.314)
Pollen vector: Water vs Self | - | -.094 (.935) | -.134 (.933)
Pollen vector: Insect+Self vs Self | - | -.342 (.155) | -.138 (.614)
Pollen vector: Wind+Self vs Self | - | -.254 (.188) | 2.18 (1.42)
Pollen vector: Wind+Insect vs Self | - | -.596 (.244) | .099 (1.99)

Classical analysis, model = Traits + Phylogeny

Further work 3:

Auxiliary residence time data

• The imputation model allows us to draw inferences about residence times for species where the arrival date is unknown

• The performance of the imputation model depends upon it containing regressors that are strongly correlated with residence time in Germany

• Possibility of using data on residence in a neighbouring country, $n_i$, as an explanatory variable:

$\log r_i \sim N(\exp\{a + b x_i + c z_i + d y_i + e n_i\},\ s^2)$

Further work 4:

Climate change

• UFZ are using the species-level model to identify key traits for invasive success, & then a spatial approach to estimate the impact of environmental change on these

• A non-spatial approach might involve grouping cells according to environmental characteristics, & fitting the species-level model separately for each group of cells

• We are interested in comparing these approaches
