Data and Statistics: New methods and future challenges Phil O’Neill University of Nottingham

Professors: How they spend their time

1. High-resolution genetic data
2. Model assessment

Gardy 2011 NEJM

“High-resolution genetic data”: what are they?

- individual-level data on the pathogen
- can be taken at single or multiple time points
- high-dimensional, e.g. whole genome sequences
- proportion of individuals sampled could be high/low
- becoming far more common due to cost reduction

“High-resolution genetic data”: what use are they?

- better inference about transmission paths
- more reliable estimates of epi quantities?
- understand evolution of the pathogen

A C C C T T G G G A A A .....

Modelling and Data Analysis methods

Two kinds of approaches exist:

1. Separate genetic and epidemic components (e.g. Volz, Rasmussen)

2. Combine genetic and epidemic components (e.g. Ypma, Worby, Morelli)

1. Separate genetic and epidemic components, e.g.:
- estimate phylogenetic tree
- given the tree, fit epidemic model
or
- cluster individuals into genetically similar groups
- given the groups, fit multi-type epidemic model

1. Separate genetic and epidemic components
+ “Simple” approach
+ Avoids complex modelling
- Ignores any relationship between transmission and genetic information

2. Combine genetic and epidemic components, e.g.:
- model genetic evolution explicitly
- define model featuring both genetic and epidemic parts

2. Combine genetic and epidemic components
+ “Integrated” approach
- Is modelling too detailed?
- Initial conditions: typical sequence?
+/- Model differences between individuals instead?
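One way to make "differences between individuals" concrete is to reduce each pair of sampled sequences to a count of differing sites. The sketch below is only an illustration of that idea: the case labels and sequences are hypothetical (the first reuses the short sequence shown earlier), the sequences are assumed to be aligned and of equal length, and nothing here is taken from the cited authors' methods.

```python
def snp_distance(seq_a, seq_b):
    """Number of positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def distance_matrix(sequences):
    """Pairwise distances for a dict mapping case label -> sequence."""
    labels = sorted(sequences)
    return {
        (i, j): snp_distance(sequences[i], sequences[j])
        for i in labels
        for j in labels
    }


# Hypothetical sequences for three sampled cases.
sequences = {
    "case1": "ACCCTTGGGAAA",
    "case2": "ACCCTTGGGAAT",
    "case3": "ACGCTTGGGAAT",
}
distances = distance_matrix(sequences)
for (i, j), d in sorted(distances.items()):
    if i < j:
        print(i, j, d)
```

A model built on such pairwise distances would then need a distribution for them given who infected whom, which is where the modelling choices discussed above come in.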

1. High-resolution genetic data
2. Model assessment

“Model assessment”: what is it?

- Does our model fit the data?
- Is there a better model?

“Model assessment”: why do it?

- Poor fit casts doubt on conclusions from modelling
- Model choice can be a tool for directly addressing questions of interest

Linear regression: y_k = a x_k + b + e_k, e_k ~ N(0, v)

Minimise distance of model mean from observed data
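As a reminder of what "fit" means in this familiar setting, here is a minimal least-squares sketch. The data values are made up for illustration, and the variable names simply mirror the a, b and v of the regression model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns slope, intercept,
    residuals and the usual unbiased estimate of the error variance v."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = s_xy / s_xx
    b = y_bar - a * x_bar
    # Residuals: observed value minus the fitted model mean.
    residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
    v_hat = sum(r * r for r in residuals) / (n - 2)
    return a, b, residuals, v_hat


# Made-up data for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
a, b, residuals, v_hat = fit_line(xs, ys)
print(a, b, v_hat)
```

Here the residuals are unambiguous: each observation has a model mean to be compared with. That is the feature that is harder to reproduce for outbreak data.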

For outbreak data:

- What are the right residuals?
- Should observed or unobserved data be compared to the model? (Streftaris and Gibson)
- Mean model may only be available via simulation
- Is the mean the right quantity to consider?

Simulation-based approaches to model fit:

- Forward simulation: is the output “close” to the data?
- Choice of summary statistics?
- Close ties to ABC methods (McKinley, Neal)
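A minimal sketch of the forward-simulation idea follows. It makes several assumptions that are not in the slides: a Reed-Frost chain-binomial model stands in for "the model", final outbreak size is the chosen summary statistic, and the observed value, population size and escape probability q are all hypothetical. In practice the parameter values would come from a fitted model rather than being fixed by hand.

```python
import random


def simulate_final_size(s0, i0, q, rng):
    """Reed-Frost chain binomial: in each generation every susceptible
    escapes infection from each current infective with probability q.
    Returns the number of initial susceptibles ever infected."""
    s, i = s0, i0
    while i > 0 and s > 0:
        p_infect = 1.0 - q ** i
        new_cases = sum(1 for _ in range(s) if rng.random() < p_infect)
        s -= new_cases
        i = new_cases
    return s0 - s


rng = random.Random(1)           # fixed seed so the run is repeatable
observed_final_size = 12         # hypothetical observed summary statistic
n_sims = 2000

# Simulate the summary statistic many times under the assumed model.
sims = [simulate_final_size(30, 1, 0.96, rng) for _ in range(n_sims)]

# How often does the model produce outbreaks at least as large as observed?
tail_fraction = sum(1 for f in sims if f >= observed_final_size) / n_sims
mean_final_size = sum(sims) / n_sims
print(mean_final_size, tail_fraction)
```

Final-size distributions of this kind are often bimodal (minor versus major outbreaks), so the simulated mean can sit where few outbreaks actually fall. That is one reason the choice of summary statistic matters.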

Approaches to model choice

- Hypermodels/saturated models
- Bayesian non-parametric methods
- Bayesian methods, e.g. RJMCMC
- Mixture models

Hypermodels/saturated models

e.g. Infection rates βS or βSI or βSI^0.5 in an SIR model? Instead use βSI^α and estimate α

(O’Neill and Wen)
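The three candidate infection rates differ only in the power to which the number of infectives is raised, so a single rate function with a free exponent contains all of them. The sketch below shows just that nesting; the exponent name alpha and the numerical values are assumptions made for illustration, and no estimation is attempted.

```python
import math


def infection_rate(beta, s, i, alpha):
    """Overall infection rate beta * S * I**alpha.

    alpha = 0 gives beta*S, alpha = 1 gives beta*S*I and alpha = 0.5
    gives beta*S*sqrt(I). Note that Python evaluates 0**0 as 1, so with
    alpha = 0 the rate stays positive even when there are no infectives.
    """
    return beta * s * i ** alpha


# Hypothetical state and parameter values.
beta, s, i = 0.002, 90, 9
for alpha in (0.0, 0.5, 1.0):
    print(alpha, infection_rate(beta, s, i, alpha))
```

Estimating the exponent from data then replaces a three-way model comparison with inference on one extra parameter.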

Bayesian non-parametric methods

e.g. Infection rate β(t)SI or β(t) in an SIR model
Estimate β(t) in a Bayesian non-parametric manner using Gaussian process machinery

(Kypraios, O’Neill and Xu; Knock and Kypraios)
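To give a feel for what a Gaussian process prior on a time-varying rate looks like, the sketch below draws one random β(t) path on a grid of times. Everything specific in it is an assumption made for illustration: the squared-exponential kernel, the exponential transform used to keep β(t) positive, the hyperparameter values and the small diagonal jitter added for numerical stability. It shows a draw from a prior only, not the inference scheme of the cited papers.

```python
import math
import random


def sq_exp_kernel(t1, t2, variance, length_scale):
    """Squared-exponential covariance between two time points."""
    return variance * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_beta_path(times, variance, length_scale, rng, jitter=1e-6):
    """Draw g ~ GP(0, k) at the given times and return beta(t) = exp(g(t))."""
    n = len(times)
    cov = [
        [
            sq_exp_kernel(times[i], times[j], variance, length_scale)
            + (jitter if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    g = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    return [math.exp(v) for v in g]


rng = random.Random(0)
times = [float(t) for t in range(11)]   # hypothetical time grid
beta_path = sample_beta_path(times, variance=0.25, length_scale=2.0, rng=rng)
print(beta_path)
```

The length scale controls how quickly β(t) is allowed to change: larger values give smoother paths.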

Reversible Jump MCMC

e.g. Distinct models (usually a small number): estimate Bayes factors by running MCMC on the union of the parameter spaces
(O’Neill; Neal and Roberts; Knock and O’Neill)

Mixture models

e.g. Given two models (g, h), create mixture model

f(x) = α g(x) + (1 - α) h(x);

estimation of α enables estimation of Bayes factors (Kypraios and O’Neill)
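The link between the mixing weight and a Bayes factor can be checked in a toy case. The sketch below assumes a particular setup that is not spelled out in the slides: the mixture is taken at the level of the whole-data likelihood, the two competing models are fully specified Poisson distributions (so their marginal likelihoods are just likelihoods), and the mixing weight, called alpha here, has a Uniform(0, 1) prior. Under those assumptions the posterior density of alpha is proportional to alpha*m_g + (1 - alpha)*m_h, its posterior mean is (2B + 1) / (3(B + 1)) with B = m_g / m_h, and inverting gives B = (3E - 1) / (2 - 3E). The data are made up.

```python
import math


def poisson_loglik(data, rate):
    """Log-likelihood of independent Poisson counts with a known rate."""
    return sum(x * math.log(rate) - rate - math.lgamma(x + 1) for x in data)


def posterior_mean_alpha(m_g, m_h, n_grid=2001):
    """Posterior mean of alpha under a Uniform(0, 1) prior, by the trapezoid
    rule on a grid. Posterior density is proportional to
    alpha * m_g + (1 - alpha) * m_h."""
    step = 1.0 / (n_grid - 1)
    alphas = [k * step for k in range(n_grid)]
    dens = [a * m_g + (1.0 - a) * m_h for a in alphas]

    def trapezoid(ys):
        return step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

    normaliser = trapezoid(dens)
    return trapezoid([a * d for a, d in zip(alphas, dens)]) / normaliser


def bayes_factor_from_alpha(mean_alpha):
    """Invert E[alpha | data] = (2B + 1) / (3(B + 1)) for B = m_g / m_h."""
    return (3.0 * mean_alpha - 1.0) / (2.0 - 3.0 * mean_alpha)


data = [3, 4, 2, 5, 3]                         # made-up counts
m_g = math.exp(poisson_loglik(data, 3.0))      # model g: Poisson(3)
m_h = math.exp(poisson_loglik(data, 4.0))      # model h: Poisson(4)

mean_alpha = posterior_mean_alpha(m_g, m_h)
bf_mixture = bayes_factor_from_alpha(mean_alpha)
bf_direct = m_g / m_h
print(mean_alpha, bf_mixture, bf_direct)
```

In realistic problems the marginal likelihoods are not available directly, which is the point of the approach: alpha can be sampled by MCMC alongside the model parameters, and the Bayes factor recovered from its posterior.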
