Model Selection Using AIC and BIC


1

Common Model Selection Statistics: AIC and BIC

Addictions Research Seminar

August 8, 2007

2

[Cartoon: “A visit from a flying mutant ______.” © David Farley]

3

Elephant Model

4

Objectives

• Know what AIC and BIC do

• Know the role that some statisticians think AIC and BIC should play in research

• Be aware of alternatives

• Be motivated to look further

5

Outline

• Objectives

• Model Selection (MS) Problems

• Commonly Used MS Statistics

– Motivations

– Use

• Alternatives and Recommendations

6

Model Selection

• What you do depends on:

– Study Design

– Suite of Collected Variables

– Purpose

– Philosophy on model building

7

Model Selection

• Model selection is not model testing

• Psychological model/theory vs. statistical model

8

Research Context

• Null Hypothesis Significance Testing

• Model Testing

– Testing structure

– Parameter testing

• Exploratory/Model Building

– Descriptive motivations

– Predictive utility

– Evidence production

9

Model Selection: Approaches

1. Select only the full model

2. Use stepwise selection, ignore selection uncertainty

3. Use a MS statistic, ignore selection uncertainty

4. Use a MS statistic & consider uncertainty

5. Do multimodel inference

6. First reduce predictors, then thoughtfully weigh models considering MS statistics

10

Although [MS Stats] are helpful exploratory tools, the model-building process should utilize theory and common sense.

Alan Agresti

Model selection is rarely based solely on [MS Stats] but depends also on the purpose of the analysis and subject matter information.

Jouni Kuha

11

Model Selection Criteria

• Test of hypotheses (NHST)

• Ad hoc methods

• Optimization of some selection criteria

– Criteria based on MSE, MS prediction error

– Information Criteria

– Consistent estimators of P(true model)

12

NHST does not mesh with IC in model selection

“A very common mistake seen in the applied literature is to ‘test’ to see whether the best model is significantly better than the second best model.”

Anderson & Burnham 2002

13

Using Statistics to Help Guide

Model Fit (MF)

• R²

• χ² goodness of fit

• MSE

Model Selection (MS)

• AIC

• BIC

• TIC, NIC, EIC, FIC, GIC, SIC, QAIC, Cp, PRESS, CAICF, MDL, HQ, Vapnik-Chervonenkis dimension…


16

AIC Motivation

• A measure of the predictive performance of the models

• It is based on information loss

17

AIC Motivation

• Based on Kullback-Leibler (K-L) information loss

• I(f, g) is the information loss due to the use of a model g to approximate reality f

• It turns out that you can compare models’ relative information loss without being able to describe reality exactly

18

AIC Motivation

Akaike found that the log-likelihood value of a model was a biased estimate of the relative information loss. The bias was approximately equal to the number of parameters in the model.

relative E(K-L) = ℓ(θ̂ | D) − p
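Multiplying through by −2 gives the familiar computational form, AIC = −2ℓ(θ̂) + 2p. A minimal Python sketch with made-up numbers, not values from the talk:

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """AIC = -2 * maximized log-likelihood + 2 * number of estimated parameters."""
    return -2.0 * log_likelihood + 2.0 * n_params

# Hypothetical model: maximized log-likelihood 91.0 with 3 parameters
print(aic(91.0, 3))  # -176.0
```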

19

AIC Functionality

• AIC selects a best model in terms of the bias/variance trade-off, not a quasi-true model

• The target model changes with the sample size.

20

• AIC is not consistent. There is always a possibility it will select models with too many variables (without finite-sample adjustments).

• AIC is efficient. The expected prediction error for AIC-selected models is the smallest possible (as N grows large).

21

What AIC Values Mean

AIC values are not interpretable on their own; they contain arbitrary constants. Only differences between models carry information:

Δi = AICi − AICmin

4 ≤ Δi ≤ 7: considerably less support

Δi > 10: difficult to support
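A short Python sketch that computes the Δi values and applies the slide’s two rules of thumb; fed the AIC column of the PDA table later in the talk, it reproduces that table’s ΔAIC column:

```python
def aic_deltas(aics):
    """Delta_i = AIC_i - min(AIC); only these differences are interpretable."""
    best = min(aics)
    return [a - best for a in aics]

def support_label(delta):
    """The two rules of thumb quoted on the slide (Burnham & Anderson)."""
    if delta > 10:
        return "difficult to support"
    if 4 <= delta <= 7:
        return "considerably less support"
    return "not covered by these two rules"

print(aic_deltas([-174, -141, -166, -151, -176, -158]))  # [2, 35, 10, 25, 0, 18]
```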

22

“When all the models have very low weights, there is no inferential credibility for any single model regarding what are the ‘important’ predictor variables. It is foolish to think that the variables included in the best model are ‘the’ important ones and the excluded are not important.”

Burnham & Anderson 2002

23

ΔAIC = 2[ℓ(θ̂2) − ℓ(θ̂1)] − 2(p2 − p1)

ΔBIC = 2[ℓ(θ̂2) − ℓ(θ̂1)] − log(n)(p2 − p1)
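A hedged sketch of the quantities these differences are built from, assuming the usual definitions AIC = −2ℓ + 2p and BIC = −2ℓ + p·log(n); the log-likelihoods and sample size below are made up:

```python
import math

def aic(ll: float, p: int) -> float:
    # ll: maximized log-likelihood, p: number of estimated parameters
    return -2.0 * ll + 2.0 * p

def bic(ll: float, p: int, n: int) -> float:
    # n: sample size; BIC replaces AIC's penalty of 2 with log(n)
    return -2.0 * ll + math.log(n) * p

# Differences between two candidate models, matching the formulas above:
ll1, p1, ll2, p2, n = 88.0, 2, 91.0, 3, 100
d_aic = aic(ll1, p1) - aic(ll2, p2)        # = 2*(ll2 - ll1) - 2*(p2 - p1)
d_bic = bic(ll1, p1, n) - bic(ll2, p2, n)  # = 2*(ll2 - ll1) - log(n)*(p2 - p1)
```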

24

BIC Motivation

“Aim of Bayesian approach is to identify the model with the highest probability of being the true model”

Kuha 2004

“The assumed purpose of the BIC-selected model was often simple prediction, as opposed to scientific understanding of the system under study”

Burnham & Anderson 2002

25

BIC Motivation

BF12 = p(D | M2) / p(D | M1)

Bayes Factor = evidence in favor of model 2 over model 1

BIC is an approximation of a transformation of the Bayes Factor (for a limited set of priors).
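Since BIC ≈ −2·log p(D | M) up to a shared constant, a Bayes Factor can be approximated from a BIC difference. A minimal sketch, indexing as in the slide’s BF12 convention, using for illustration two BIC values like those in the PDA table later in the talk:

```python
import math

def approx_bayes_factor(bic_1: float, bic_2: float) -> float:
    """Approximate BF12 = p(D | M2) / p(D | M1) via exp((BIC_1 - BIC_2) / 2)."""
    return math.exp((bic_1 - bic_2) / 2.0)

print(approx_bayes_factor(-112.0, -124.0))  # ~403: strong evidence for model 2
```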

26

BIC does not always need to be a good approximation of the Bayes Factor if it is used mainly to identify which of the models has the highest posterior probability.

27

Justification for BIC

BIC is consistent: it asymptotically reaches its goal of selecting the true model.

28

Meaning of BIC Values

pi is the posterior probability that model i is the true model (assuming that there is a true model and that it is in your model set).

pi = exp(−½ ΔBICi) / Σr=1..R exp(−½ ΔBICr)
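A direct transcription of this formula into Python; a minimal sketch assuming equal prior probabilities across the R candidate models:

```python
import math

def bic_posteriors(bics):
    """p_i = exp(-0.5 * dBIC_i) / sum_r exp(-0.5 * dBIC_r),
    where dBIC_i = BIC_i - min(BIC)."""
    best = min(bics)
    weights = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(weights)
    return [w / total for w in weights]
```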

29

PDA Model

Candidate terms for each model (x = term included): Linear, Quadratic, and Cubic time trends; HDRS; Tx; PDA1; Attendance.

Model  Terms included   AIC   ΔAIC   BIC   P(true model)
1      x x x x x        -174    2   -112   0.002428
2      x x x x x x      -141   35    -70   1.84e-12
3      x x x x x x      -166   10   -100   6.02e-06
4      x x x            -151   25    -99   3.65e-06
5      x x x x          -176    0   -124   0.979624
6      x x x            -158   18   -116   0.017942
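Feeding the BIC column of this table into the bic_posteriors sketch above reproduces the P(true model) column:

```python
probs = bic_posteriors([-112, -70, -100, -99, -124, -116])
print(["%.3g" % p for p in probs])
# ['0.00243', '1.84e-12', '6.02e-06', '3.65e-06', '0.98', '0.0179']
```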

30

Similarities

• Penalized model selection criteria

• Data must be fixed

• They can be special cases of each other

• Both good at approximating target quantities

• Bayesian or frequentist derivation

• Ambivalence

• Only as good as your data

Differences

• BIC is dimension consistent; AIC approximates relative information loss

• BIC penalizes complex models more than AIC

• Definition of a “good model”

• Need for a true model

31

Burnham and Anderson’s Objection to BIC

We question the concept of a simple “true model” in the biological sciences and would surely think, if it existed, that it would not be in the set of candidate models.

There is nothing in the foundation of BIC that addresses a bias-variance trade-off, and hence addresses parsimony as a feature of BIC model selection.

32

Others’ Views

For model selection purposes, there is no clear choice between AIC and BIC.

Kuha 2004

BIC’s target model doesn’t depend on N, but we know the number of parameters selected will, so BIC can’t deliver on its objective in practice.

KMC

33

“All models are wrong, but some are useful.”

George Box

Any model is just a simplification of reality.

Select a model that is a useful description or powerful predictor.

34

Simulation Results

• BIC does better than AIC when the true model is included as a candidate, and is often better than AICc

• AIC does better when the true model is not in the set

• These are not universal results (a toy simulation sketch follows)
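These claims are easy to probe with a toy simulation. A minimal, self-contained sketch (not Kuha’s actual design; the data-generating model, sample size, and candidate set are all made up) for the case where the true model is in the candidate set:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_loglik(X, y):
    """Maximized Gaussian log-likelihood of an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n = len(y)
    sigma2 = resid @ resid / n  # MLE of the error variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

def select(y, designs, n, use_bic):
    """Index of the candidate design minimizing AIC or BIC."""
    penalty = np.log(n) if use_bic else 2.0
    scores = [-2 * gaussian_loglik(X, y) + penalty * (X.shape[1] + 1)
              for X in designs]
    return int(np.argmin(scores))

n = 200
x = rng.normal(size=(n, 4))
mean = 1.0 * x[:, 0] + 0.5 * x[:, 1]       # true model: first two predictors
designs = [x[:, :k] for k in range(1, 5)]  # nested candidates, k = 1..4

hits = {"AIC": 0, "BIC": 0}
for _ in range(500):
    y = mean + rng.normal(size=n)
    hits["AIC"] += select(y, designs, n, use_bic=False) == 1  # index 1 = true model
    hits["BIC"] += select(y, designs, n, use_bic=True) == 1
print(hits)  # BIC typically recovers the true model more often in this setup
```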

35

Simulation Results

[Figure: relative error of AIC- and BIC-selected models; Kuha 2004]

36

Alternative Approaches Exist

• Direct

• Cross-validation: split the data into Train / Validate / Test (a minimal sketch follows)

• Use all the models at the same time!

• Report the top contenders
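For the cross-validation route, a minimal sketch of k-fold cross-validated prediction error for OLS candidates; the function name and the k = 5 default are my choices, not from the talk:

```python
import numpy as np

def cv_mse(X, y, k=5, seed=0):
    """k-fold cross-validated mean squared prediction error for an OLS model."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    errors = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        errors.append(np.mean((y[fold] - X[fold] @ beta) ** 2))
    return float(np.mean(errors))

# Usage: pick the candidate design matrix with the lowest CV error, e.g.
# best = min(range(len(designs)), key=lambda i: cv_mse(designs[i], y))
```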

37

Recommendations

• Establish a philosophy

• Conduct thoughtful model building

• Use MS stats as a guide only

• Use multiple stats simultaneously

38

Elephant Model

39

Objectives

• Know what AIC and BIC do.

• Know the role that some statisticians think AIC and BIC should play in research.

• Be aware of alternatives.

• Be motivated to learn more about AIC/BIC.

40

[Cartoon © David Farley]

41

Restricted Space and Directed Selection

• Akaike believed that the most important contribution of his general approach was the clarification of the importance of modeling and the need for substantial prior information on the system being studied.

• The importance of carefully defining a small set of candidate models cannot be overemphasized. (Burnham & Anderson 2002)

Recommended