Generalized Additive Models
David L Miller
Overview
- What is a GAM?
- What is smoothing?
- How do GAMs work? (Roughly)
- Fitting and plotting simple models
What is a GAM?
Generalized Additive Models
- Generalized: many response distributions
- Additive: terms add together
- Models: well, it's a model…
To GAMs from GLMs and LMs
(Generalized) Linear Models
Models that look like:

y_i = β_0 + x_{1i} β_1 + x_{2i} β_2 + … + ε_i

(describe the response, y_i, as a linear combination of the covariates, x_{ji}, with an offset)
- We can make y_i ∼ any exponential family distribution (Normal, Poisson, etc).
- The error term ε_i is normally distributed (usually).
Why bother with anything more complicated?!
Is this linear?
Is this linear? Maybe?

lm(y ~ x1, data=dat)
What can we do?
Adding a quadratic term?

lm(y ~ poly(x1, 2), data=dat)
Is this sustainable?
- Adding in quadratic (and higher) terms can make sense
- This feels a bit ad hoc
- Better if we had a framework to deal with these issues?

[drumroll]
What does a model look like?

y_i = β_0 + ∑_j s_j(x_{ji}) + ε_i

where ε_i ∼ N(0, σ²), y_i ∼ Normal (for now)
- Remember that we're modelling the mean of this distribution!
- Call the above equation the linear predictor
Okay, but what about these "s" things?
Think s = smooth
- Want to model the covariates flexibly
- Covariates and response not necessarily linearly related!
- Want some "wiggles"
What is smoothing?
Straight lines vs. interpolation
- Want a line that is "close" to all the data
- Don't want interpolation – we know there is "error"
- Balance between interpolation and "fit"
Splines
- Functions made of other, simpler functions
- Basis functions b_k, estimate β_k
- Makes the math(s) much easier

s(x) = ∑_{k=1}^{K} β_k b_k(x)
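A minimal sketch of what this looks like in mgcv (not part of the original slides; data and k are made up for illustration): smoothCon() builds a smooth and returns the basis functions evaluated at the data.

```r
# Evaluate the basis functions b_k(x) of a (default thin plate) spline
# using mgcv's smooth constructor. Data are simulated purely for illustration.
library(mgcv)
df <- data.frame(x = seq(0, 1, length.out = 100))
sm <- smoothCon(s(x, k = 6), data = df)[[1]]
dim(sm$X)                       # 100 rows (observations), 6 columns (basis functions)
matplot(df$x, sm$X, type = "l") # each column of X is one basis function b_k(x)
```

Each column of sm$X is one b_k(x); a smooth is then just the weighted sum X %*% β.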
Design matrices
We often write models as Xβ
- X is our data
- β are parameters we need to estimate
For a GAM it's the same
- X has columns for each basis function, evaluated at each observation
- again, Xβ is the linear predictor
Measuring wigglyness
Visually:
- Lots of wiggles == NOT SMOOTH
- Straight line == VERY SMOOTH
How do we do this mathematically?
Derivatives!
(Calculus was a useful class after all!)
Wigglyness by derivatives
What was that grey bit?

∫_ℝ (∂²f(x) / ∂x²)² dx

(Take some derivatives of the smooth and integrate them over x)
(Turns out we can always write this as β^T S β, so the β is separate from the derivatives)
(Call S the penalty matrix)
Making wigglyness matter
- β^T S β measures wigglyness
- "Likelihood" measures closeness to the data
- Trade off closeness to the data against wigglyness…
- Use a smoothing parameter λ to decide on that trade-off…
- Estimate the β_k terms but penalise the objective:

"closeness to data" + λ β^T S β
Smoothing parameter
Smoothing parameter selection
Many methods: AIC, Mallows' C_p, GCV, ML, REML
Recommendation, based on simulation and practice:
- Use REML or ML
- Reiss & Ogden (2009), Wood (2011)
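To see what λ actually does, here is an illustrative sketch (simulated data, extreme values chosen for effect) fixing the smoothing parameter by hand via gam's sp argument versus letting REML choose it:

```r
# Illustrative only: fix the smoothing parameter (sp=) vs letting REML pick it.
library(mgcv)
set.seed(1)
dat <- data.frame(x = runif(200))
dat$y <- sin(2 * pi * dat$x) + rnorm(200, sd = 0.3)

m_wiggly <- gam(y ~ s(x, k = 20), data = dat, sp = 1e-8) # almost no penalty
m_smooth <- gam(y ~ s(x, k = 20), data = dat, sp = 1e8)  # heavy penalty
m_reml   <- gam(y ~ s(x, k = 20), data = dat, method = "REML")

# total EDF shrinks as the smoothing parameter grows; REML sits in between
c(wiggly = sum(m_wiggly$edf), smooth = sum(m_smooth$edf), reml = sum(m_reml$edf))
```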
Maximum wigglyness
- We can set basis complexity or "size" (k)
- k = maximum wigglyness
- Smooths have effective degrees of freedom (EDF)
- EDF < k
- Set k "large enough"
- Penalty does the rest
More on this in a bit…
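A quick sketch of this in practice (simulated data, illustrative only): set k, fit, then compare the EDF to k and run gam.check():

```r
# Illustrative only: set k "large enough", let the penalty do the rest,
# then check that the EDF is comfortably below k.
library(mgcv)
set.seed(2)
dat <- data.frame(x = runif(200))
dat$y <- sin(2 * pi * dat$x) + rnorm(200, sd = 0.3)

m <- gam(y ~ s(x, k = 20), data = dat, method = "REML")
sum(m$edf)   # total effective degrees of freedom, well below k = 20
gam.check(m) # includes a rough check of whether k was large enough
```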
GAM summary
- Straight lines suck, we want wiggles
- Use little functions (basis functions) to make big functions (smooths)
- Need to make sure your smooths are wiggly enough
- Use a penalty to trade off wiggliness/generality
Fitting GAMs in practice
Translating maths into R
A simple example:

y_i = β_0 + s(x_i) + s(w_i) + ε_i

where ε_i ∼ N(0, σ²)
Let's pretend that y_i ∼ Normal
- linear predictor: formula = y ~ s(x) + s(w)
- response distribution: family = gaussian()
- data: data = some_data_frame
Putting that together

my_model <- gam(y ~ s(x) + s(w),
                family = gaussian(),
                data = some_data_frame,
                method = "REML")

method="REML" uses REML for smoothness selection (the default is "GCV.Cp")
What about a practical example?
Pantropical spotted dolphins
- Example taken from Miller et al (2013)
- Paper appendix has a better analysis
- Simple example here, ignoring all kinds of important stuff!
Inferential aims
- How many dolphins are there?
- Where are the dolphins?
- What are they interested in?
A simple dolphin model
- count is a function of depth
- off.set is the effort expended
- we have count data, so try a quasi-Poisson distribution

library(mgcv)
dolphins_depth <- gam(count ~ s(depth) + offset(off.set),
                      data = mexdolphins,
                      family = quasipoisson(),
                      method = "REML")
What did that do?

summary(dolphins_depth)

Family: quasipoisson
Link function: log

Formula:
count ~ s(depth) + offset(off.set)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -18.2344     0.8949  -20.38   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
           edf Ref.df     F p-value
s(depth) 6.592  7.534 2.329  0.0224 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.0545   Deviance explained = 26.4%
-REML = 948.28  Scale est. = 145.34   n = 387
Plotting

plot(dolphins_depth)

- Dashed lines indicate +/- 2 standard errors
- Rug plot
- On the link scale
- EDF on the y axis
Thin plate regression splines
- Default basis
- One basis function per data point
- Reduce # basis functions (eigendecomposition)
- Fitting on reduced problem
- Multidimensional
Wood (2003)
Bivariate terms
- Assumed an additive structure
- No interaction
- We can specify s(x,y) (and s(x,y,z,...))
- (Assuming isotropy here…)
Adding a term
- Add a surface for location (x and y)
- Just use + for an extra term

dolphins_depth_xy <- gam(count ~ s(depth) + s(x, y) + offset(off.set),
                         data = mexdolphins,
                         family = quasipoisson(),
                         method = "REML")
Summary

summary(dolphins_depth_xy)

Family: quasipoisson
Link function: log

Formula:
count ~ s(depth) + s(x, y) + offset(off.set)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -19.1933     0.9468  -20.27   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
            edf Ref.df     F p-value
s(depth)  6.804  7.669 1.461   0.191
s(x,y)   23.639 26.544 1.358   0.114

R-sq.(adj) = 0.22   Deviance explained = 49.9%
-REML = 923.9  Scale est. = 79.474  n = 387
Plotting
- scale=0: each plot on a different scale
- pages=1: plot together

plot(dolphins_depth_xy, scale=0, pages=1)
Plotting 2d terms... erm...
- select= picks which smooth to plot

plot(dolphins_depth_xy, select=2, cex=2, asp=1, lwd=2)
Let's try something different
- scheme=2 is much better for bivariate terms
- vis.gam() is much more general

plot(dolphins_depth_xy, select=2, cex=2, asp=1, lwd=2, scheme=2)
More complex plots

par(mfrow=c(1,2))
vis.gam(dolphins_depth_xy, view=c("depth","x"), too.far=0.1, phi=30, theta=45)
vis.gam(dolphins_depth_xy, view=c("depth","x"), plot.type="contour", too.far=0.1, asp=1/1000)
Fitting/plotting GAMs summary
- gam does all the work
- very similar to glm
- s() indicates a smooth term
- plot can give simple plots
- vis.gam for more advanced stuff
Prediction
What is a prediction?
- Evaluate the model at a particular covariate combination
- Answering (e.g.) the question "at a given depth, how many dolphins?"
Steps:
1. evaluate the s(…) terms
2. move to the response scale (exponentiate? Do nothing?)
3. (multiply by any offset etc)
Example of prediction
In maths:
- Model: count_i = A_i exp(β_0 + s(x_i, y_i) + s(Depth_i))
- Drop in the values of x, y, Depth (and A)
In R:
- build a data.frame with x, y, Depth, A
- use predict()

preds <- predict(my_model, newdata=my_data, type="response")

(se.fit=TRUE gives a standard error for each prediction)
Back to the dolphins...
Where are the dolphins?
(ggplot2 code included in the slide source)

dolphin_preds <- predict(dolphins_depth_xy, newdata=preddata, type="response")
Prediction summary
- Evaluate the fitted model at a given point
- Can evaluate many at once (data.frame)
- Don't forget the type=... argument!
- Obtain per-prediction standard errors with se.fit
What about uncertainty?
Without uncertainty, we're not doing statistics
Where does uncertainty come from?
- β: uncertainty in the spline parameters
- λ: uncertainty in the smoothing parameter
(Traditionally we've only addressed the former)
(New tools let us address the latter…)
Parameter uncertainty
From theory:

β ∼ N(β̂, V_β)

(caveat: the normality is only approximate for non-normal responses)
What does this mean? A variance for each parameter.
In mgcv: vcov(model) returns V_β.
What can we do with this?
- confidence intervals in plot
- standard errors using se.fit
- derived quantities? (see bibliography)
The lp matrix, magic, etc
For regular predictions:
- form η̂_p = L_p β̂ using the prediction data, evaluating basis functions as we go
- (Need to apply the link function to η̂_p)
But the L_p fun doesn't stop there…
[[mathematics intensifies]]
Variance and lp matrix
To get variance on the scale of the linear predictor:

V_η̂ = L_p V_β̂ L_p^T

- pre-/post-multiplication shifts the variance matrix from parameter space to linear predictor space
- (Can then pre-/post-multiply by derivatives of the link to put variance on the response scale)
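In mgcv this is a few lines (illustrative sketch, simulated data): predict with type="lpmatrix" gives L_p, and vcov gives V_β̂.

```r
# Illustrative only: build L_p with type="lpmatrix" and form the variance
# on the linear predictor scale, V_eta = L_p V_beta L_p^T.
library(mgcv)
set.seed(3)
dat <- data.frame(x = runif(100))
dat$y <- sin(2 * pi * dat$x) + rnorm(100, sd = 0.3)
m <- gam(y ~ s(x), data = dat, method = "REML")

newd  <- data.frame(x = seq(0, 1, length.out = 10))
Lp    <- predict(m, newdata = newd, type = "lpmatrix")
V_eta <- Lp %*% vcov(m) %*% t(Lp)

# square roots of the diagonal are the link-scale standard errors,
# the same quantity predict(..., se.fit=TRUE) reports
sqrt(diag(V_eta))
```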
Simulating parameters
β has a distribution, so we can simulate from it
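A sketch of that simulation (simulated data; mvrnorm from MASS stands in for any multivariate normal sampler): draw β vectors from N(β̂, V_β̂) and push each through the lp matrix.

```r
# Illustrative only: draw parameter vectors from N(beta_hat, V_beta), then
# push each draw through the lp matrix to get simulated fitted smooths.
library(mgcv)
library(MASS) # for mvrnorm
set.seed(4)
dat <- data.frame(x = runif(100))
dat$y <- sin(2 * pi * dat$x) + rnorm(100, sd = 0.3)
m <- gam(y ~ s(x), data = dat, method = "REML")

newd  <- data.frame(x = seq(0, 1, length.out = 50))
Lp    <- predict(m, newdata = newd, type = "lpmatrix")
betas <- mvrnorm(100, coef(m), vcov(m)) # 100 draws of the parameter vector
sims  <- Lp %*% t(betas)                # each column is one simulated curve
matplot(newd$x, sims, type = "l", lty = 1, col = "grey")
```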
Uncertainty in smoothing parameter
- Recent work by Simon Wood
- a "smoothing parameter uncertainty corrected" version of V_β̂
- In a fitted model, we have:
  - $Vp: what we get with vcov
  - $Vc: the corrected version
- Still experimental
Variance summary
- Everything comes from the variance of the parameters
- Need to re-project/re-scale them to get the quantities we need
- mgcv does most of the hard work for us
- Fancy stuff possible with a little maths
- Can include uncertainty in the smoothing parameter too
Okay, that was a lot of information
Summary
- GAMs are GLMs plus some extra wiggles
- Need to make sure things are just wiggly enough
- Basis + penalty is the way to do this
- Fitting looks like glm with extra s() terms
- Most stuff comes down to matrix algebra that mgcv shields you from
- To do fancy stuff, get inside the matrices
COFFEE