The Problem with Parameter Redundancy
Diana Cole, University of Kent


Parameter Redundancy

• A model is parameter redundant (or non-identifiable) if you cannot estimate all the parameters.
• Consider a basic occupancy model, which considers whether or not a species is present at a particular site.
  – Parameters: $\psi$, species is present; $p$, species is detected.
  – Species detected at a site with probability $\psi p$.
  – Species not detected at a site with probability $1 - \psi p$.
  – Basic model is parameter redundant: can only estimate the product $\psi p$ rather than $\psi$ and $p$.

• There are several different methods for detecting parameter redundancy, including:
  – numerical methods (e.g. Viallefont et al, 1998)
  – symbolic methods (e.g. Cole et al, 2010)
  – hybrid symbolic-numeric method (Choquet and Cole, 2012)

• These methods generally involve calculating the rank of a matrix, which gives the number of parameters that can be estimated.
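
As a minimal sketch of such a rank calculation, assume a single-visit occupancy model with occupancy probability psi and detection probability p, summarised by its two outcome probabilities psi*p and 1 - psi*p. The names `psi`, `p`, `matrix_rank` and `occupancy_derivative_matrix`, and the evaluation point, are illustrative assumptions rather than anything taken from the slides. The derivatives are written out by hand and evaluated at an arbitrary rational point, in the spirit of the hybrid symbolic-numeric approach, and the rank is found by exact Gaussian elimination.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix of Fractions by exact Gaussian elimination."""
    m = [list(r) for r in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def occupancy_derivative_matrix(psi, p):
    """Derivatives of [psi*p, 1 - psi*p] with respect to psi (row 1) and p (row 2)."""
    return [
        [p, -p],      # d/dpsi of each probability
        [psi, -psi],  # d/dp of each probability
    ]


# Arbitrary interior point (an assumption); both rows are multiples of [1, -1].
D = occupancy_derivative_matrix(Fraction(2, 5), Fraction(7, 10))
rank = matrix_rank(D)
print(f"rank = {rank} with 2 parameters, deficiency = {2 - rank}")
```

A rank lower than the number of parameters indicates that not all of them can be estimated; here the second row is a multiple of the first, so only one quantity is estimable.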

Problems with Parameter Redundancy

• There will be a flat ridge in the likelihood of a parameter redundant model (Catchpole and Morgan, 1997), resulting in more than one set of maximum likelihood estimates.

• Numerical methods to find the MLE will not pick up the flat ridge, although it could be picked up by trying multiple starting values and looking at profile log-likelihoods.

• The Fisher information matrix will be singular (Rothenberg, 1971) and therefore the standard errors will be undefined.

• However, the exact Fisher information matrix is rarely known. Standard errors are typically approximated using a Hessian matrix obtained numerically. Can parameter redundancy be detected from the standard errors?
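
The flat ridge can be illustrated with a small sketch, assuming a single-visit occupancy model in which a species is detected at a site with probability psi*p. The data (40 detections at 100 sites), the parameter names and the function `occupancy_loglik` are hypothetical choices for illustration. Every pair (psi, p) with the same product gives the same log-likelihood, so an optimiser started from different values can stop at different points along the ridge.

```python
import math


def occupancy_loglik(psi, p, n_sites, n_detected):
    """Binomial log-likelihood (constant term dropped): detection has probability psi*p."""
    q = psi * p
    return n_detected * math.log(q) + (n_sites - n_detected) * math.log(1 - q)


# Hypothetical data: the species was detected at 40 of 100 sites.
n_sites, n_detected = 100, 40

# Three different parameter pairs sharing the same product psi*p = 0.4.
ridge = [(0.8, 0.5), (0.5, 0.8), (0.4, 1.0)]
ridge_values = [occupancy_loglik(psi, p, n_sites, n_detected) for psi, p in ridge]

# A pair off the ridge (psi*p = 0.25) has a strictly lower log-likelihood.
off_ridge_value = occupancy_loglik(0.5, 0.5, n_sites, n_detected)

for (psi, p), value in zip(ridge, ridge_values):
    print(f"psi={psi}, p={p}: log-likelihood = {value:.4f}")
print(f"psi=0.5, p=0.5: log-likelihood = {off_ridge_value:.4f}")
```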

Is example 1 parameter redundant?

Parameter   Estimate   Standard Error
…           0.39       imaginary
…           0.64       0.061
…           0.09       imaginary
…           0.18       imaginary

• Hessian (H) computed numerically has rank 4 (the exact Hessian would have rank < 4 if the model is parameter redundant).
• Singular Value Decomposition: write $H = U S V^{T}$, where $S$ is a diagonal matrix of singular values (for a symmetric matrix such as H, the absolute values of its eigenvalues); the number of non-zero values is the rank of the matrix.
• Standardised singular values: …
• Hybrid symbolic-numeric method: rank 3, only … is estimable.
• Symbolic method: rank 3, estimable parameter combinations …
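
The numerical check described above can be sketched as follows. This is not the four-parameter model of example 1, whose details are not given here. Instead it assumes a two-parameter single-visit occupancy model (detection probability psi*p), hypothetical data, a finite-difference Hessian, and standardisation by dividing by the largest singular value. For a symmetric 2x2 matrix the singular values are the absolute eigenvalues, so they are computed in closed form rather than with a linear algebra library. The threshold is an illustrative choice.

```python
import math


def neg_loglik(theta, n_sites=100, n_detected=40):
    """Negative log-likelihood of the assumed occupancy model (constant dropped)."""
    psi, p = theta
    q = psi * p
    return -(n_detected * math.log(q) + (n_sites - n_detected) * math.log(1 - q))


def numerical_hessian(f, theta, h=1e-4):
    """Central finite-difference approximation to the Hessian of f at theta."""
    k = len(theta)
    H = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            def shifted(di, dj):
                t = list(theta)
                t[i] += di * h
                t[j] += dj * h
                return f(t)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


def singular_values_sym2(H):
    """Singular values of a symmetric 2x2 matrix: the absolute eigenvalues."""
    a, b, d = H[0][0], (H[0][1] + H[1][0]) / 2, H[1][1]
    mean, radius = (a + d) / 2, math.hypot((a - d) / 2, b)
    return sorted((abs(mean + radius), abs(mean - radius)), reverse=True)


# One of the maximum likelihood points on the ridge psi*p = 40/100.
theta_hat = (0.8, 0.5)
H = numerical_hessian(neg_loglik, theta_hat)
s = singular_values_sym2(H)
standardised = [v / s[0] for v in s]

threshold = 1e-4  # illustrative cut-off, not a recommended value
estimated_rank = sum(v > threshold for v in standardised)
print("standardised singular values:", standardised)
print("estimated rank:", estimated_rank, "of", len(theta_hat), "parameters")
```

Because the Hessian is only approximate, the smallest singular value is tiny rather than exactly zero, so the estimated rank depends on where the threshold is set.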

Is example 2 parameter redundant?

Parameter   Estimate   Standard Error
…           0.41       0.70
…           0.83       0.07
…           0.10       0.11
…           0.19       0.33

• Hessian (H) computed numerically has rank 4 (the exact Hessian would have rank < 4 if the model is parameter redundant).
• Standardised Singular Value Decomposition: …
• Hybrid symbolic-numeric method: rank 3, only … is estimable.
• Symbolic method: rank 3, estimable parameter combinations …

Is example 3 parameter redundant?

Parameter   Estimate   Standard Error
…           0.37       0.19
…           0.48       0.19
…           0.39       0.20
…           0.34       0.17
…           0.40       0.20
…           0.65       0.06
…           0.10       0.03
…           0.18       0.09

• Standardised Singular Value Decomposition: [1.00 0.65 0.11 0.096 0.074 0.039 0.034 0.0011]
• Hybrid symbolic-numeric method: rank 8, so the model is not parameter redundant.
• Symbolic method: rank 8, so the model is not parameter redundant, but further tests reveal that the model could be near redundant, as when … the model is the same as example 1.

Simulation Study for Example 1/2

52% of simulations have defined standard errors.

Parameter   True Value   Average MLE   St. Dev. MLE
…           0.4          0.57          0.27
…           0.7          0.50          0.29
…           0.1          0.50          0.31
…           0.2          0.52          0.30

SVD threshold   % SVD test correct
0.01            100%
0.001           72%
0.0001          11%
0.00001         2%

Computer Packages and Parameter Redundancy

MARK (Cooch and Evans, 2014)

• Counts the number of estimable parameters using a numerical procedure involving a Singular Value Decomposition, if “2ndPart” is chosen rather than “Hessian” for variance estimation.

• Using the “Hessian” method, parameter redundancy is missed; we agree with the recommendation of Cooch and Evans (2014) to use the default of “2ndPart”.

• Standard errors for non-identifiable parameters are either very large or zero and should be ignored. Parameter estimates for non-identifiable parameters are unreliable and should be ignored.

• Parameter redundancy could be caused by the model or the data.

• Recommend refitting any parameter redundant model with suitable constraints.

Computer Packages and Parameter Redundancy

M-surge / E-surge (Choquet et al, 2004; Choquet et al, 2009)

• Uses the hybrid symbolic-numeric method to detect parameter redundancy, but will not be able to tell whether parameter redundancy is caused by the model or the data. (Parameter redundancy caused by the model could be examined if you used simulated data.)

• Gives which parameters can and cannot be estimated, but cannot find estimable parameter combinations in parameter redundant models (currently only possible symbolically).

• Also recommend refitting parameter redundant models with suitable constraints.

Conclusion

• It is not always possible to tell from model fitting that a model is parameter redundant.

• Recommend at least using a numeric method to check for parameter redundancy, but symbolic or hybrid methods are more reliable.

• Fitting parameter redundant models results in large bias for non-identifiable parameters and can introduce bias in the identifiable parameters.

• If a model is parameter redundant it needs to be (re)fitted with constraints, which can be obtained using the symbolic method.
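
As a hedged illustration of refitting with constraints, again assume a single-visit occupancy model with detection probability psi*p and hypothetical data. The function names and the fixed value of p are assumptions made for this sketch. The model can be refitted either in terms of the estimable combination beta = psi*p, or with p constrained to a value treated as known; both give well-defined standard errors.

```python
import math


def fit_estimable_combination(n_sites, n_detected):
    """MLE and standard error of beta = psi*p from binomial detection data."""
    beta_hat = n_detected / n_sites
    # Square root of the inverse Fisher information for a binomial proportion.
    std_error = math.sqrt(beta_hat * (1 - beta_hat) / n_sites)
    return beta_hat, std_error


def fit_with_fixed_detection(n_sites, n_detected, p_fixed):
    """MLE of psi when p is constrained to a known value (needs beta_hat <= p_fixed)."""
    beta_hat, std_error = fit_estimable_combination(n_sites, n_detected)
    # psi = beta / p_fixed, so the standard error scales by the same factor.
    return beta_hat / p_fixed, std_error / p_fixed


# Hypothetical data: 40 detections at 100 sites; p fixed at 0.5 for illustration.
beta_hat, beta_se = fit_estimable_combination(100, 40)
psi_hat, psi_se = fit_with_fixed_detection(100, 40, p_fixed=0.5)
print(f"beta = psi*p: estimate {beta_hat:.3f}, standard error {beta_se:.3f}")
print(f"psi with p fixed at 0.5: estimate {psi_hat:.3f}, standard error {psi_se:.3f}")
```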

References

• Catchpole, E. A. and Morgan, B. J. T. (1997) Detecting parameter redundancy. Biometrika, 84, 187-196.

• Choquet, R. and Cole, D. J. (2012) A Hybrid Symbolic-Numerical Method for Determining Model Structure. Mathematical Biosciences, 236, p117.

• Choquet, R., Reboulet, A. M., Pradel, R., Gimenez, O. and Lebreton, J. D. (2004) M-SURGE: new software specifically designed for multistate capture-recapture models. Animal Biodiversity and Conservation, 27(1), 207-215.

• Choquet, R., Rouan, L. and Pradel, R. (2009) Program E-SURGE: a software application for fitting Multievent models. Series: Environmental and Ecological Statistics, Vol. 3, Thomson, David L.; Cooch, Evan G.; Conroy, Michael J. (Eds.), p 845-865.

• Cole, D. J., Morgan, B. J. T. and Titterington, D. M. (2010) Determining the Parametric Structure of Non-Linear Models. Mathematical Biosciences, 228, 16-30.

• Cooch and Evans (2014) Program Mark. A Gentle Introduction.

• Rothenberg, T. J. (1971) Identification in parametric models. Econometrica, 39, 577-591.

• Viallefont, A., Lebreton, J. D., Reboulet, A. M. and Gory, G. (1998) Parameter Identifiability and Model Selection in Capture-Recapture Models: A Numerical Approach. Biometrical Journal, 40, 313-325.