Semiparametric and Nonparametric Additive Regression Models

Matúš Maciak
Department of Probability and Mathematical Statistics, MFF UK
[email protected]

March 30, 2007

Source: artax.karlin.mff.cuni.cz/~macim1am/pub/obhajoba.pdf


Contents

1. Introduction
   - Motivation
   - Curse of Dimensionality
   - Additive Decomposition
2. Additive Regression
   - Spline Estimates
   - Kernel Estimates
3. Generalized Additive Regression
4. Rate of Convergence
5. Model Selection Criteria
6. Adaptive Methods
   - Well-known algorithms
   - Real data example


The main objectives...

1. The "Curse of Dimensionality" problem - the main reason why one tries
   to apply additive semiparametric and nonparametric regression approaches.
2. The most frequently used methods to obtain additive estimates.
3. Generalized additive regression models - in the special case of binary
   data samples.
4. Expectation - to achieve the same rates of convergence for additive
   estimates as in the case of a univariate regression problem.
5. Model selection criteria - the optimal choice of the final model from
   the set of all proposed models.
6. Adaptive strategies (CIM) - RPR, MARS, PPR, etc.


Multivariate Regression

Let X ∈ χ ⊆ RJ be a J−dimensional random variable andconsider a random variable Y with a mean µ ∈ R and the finitesecond moment EY 2 < ∞.

Let f : χ ∈ RJ → R be a J−dimensional function such thatE[Y |X = x] = f(x) - regression function of Y on X.

Regression function f(x) is supposed to be smooth up to thespecific order.

There are no other assumptions taken on the functional form ofthe function f(·) but smoothness.


Multivariate Kernel Regression

Multidimensional Smoothing ⇒ Multidimensional Regression

f(x) = E[Y | X = x] = ∫ y g(y | x) dy = ∫ y p1(y, x) dy / p2(x)

Estimates of the densities p1, p2 ⇒ Kernel Density Estimation:

f̂_h(x) = [ Σ_{i=1}^N κ_h(Xi − x) Yi ] / [ Σ_{i=1}^N κ_h(Xi − x) ],

where κ_h is the multivariate, multiplicative kernel and h = (h1, . . . , hJ)
is a vector of appropriate bandwidths.

Problems: the "Curse of Dimensionality" and slow asymptotic convergence...
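The estimator above can be sketched in a few lines. The slides only require κ_h to be a multivariate multiplicative kernel; the Gaussian kernel below is an illustrative choice, and the data-generating function is invented for the example.

```python
import numpy as np

def nw_estimate(x, X, Y, h):
    """Nadaraya-Watson estimate f_h(x) with a multiplicative Gaussian kernel.

    X : (N, J) design matrix, Y : (N,) responses, h : (J,) bandwidth vector.
    """
    u = (X - x) / h                        # (N, J) scaled differences
    k = np.exp(-0.5 * u**2).prod(axis=1)   # product kernel kappa_h(X_i - x)
    return (k * Y).sum() / k.sum()

# toy example (hypothetical): f(x) = x1 + x2^2 on [0, 1]^2
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 2))
Y = X[:, 0] + X[:, 1]**2 + rng.normal(scale=0.05, size=500)
print(nw_estimate(np.array([0.5, 0.5]), X, Y, h=np.array([0.1, 0.1])))
# true f(0.5, 0.5) = 0.5 + 0.25 = 0.75
```

Even this toy setting hints at the curse of dimensionality: the local average uses only the points falling inside the bandwidth window, and that count collapses as J grows.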


Additive approaches...

Let (X, Y) ∈ R^{J+1} be a pair of random variables such that
X = (X1, . . . , XJ) and Y is a real-valued variable with mean EY = µ and
finite second moment 0 < EY² ≤ K < ∞.

Consider an unknown regression function f : R^J → R of Y on X ∈ R^J,
so that f(x) = E[Y | X = x] (in the sequel f : [0, 1]^J → R).

We impose one more condition:

f(x1, . . . , xJ) = µ + Σ_{j=1}^J fj(xj)

The functional components fj are uniquely determined and E fj(Xj) = 0.

The smoothness assumption remains (smoothness of the functional components).


Additive Estimates...

Let (X1, Y1), (X2, Y2), . . . , (XN , YN) denote an independentrandom sample, where each pair (Xi , Yi) has the samedistribution as (X, Y ).Estimates of the true underlying regression function are given bydifferent approaches (splines techniques, B-splines and kernelestimates)Semi-parametric (Nonparametric) estimate is based on therandom sample of size N - it can be written in the additive form:

fN(x1, . . . , xJ) = Y N +J∑

j=1

fNj(xj)

Regarding to the assumption on the functional components fj onehas to consider that

∑i=1,...,N fNj(Xij) = 0 for all j ∈ {1, . . . , J}.
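The slides name splines and kernels as the univariate building blocks but do not spell out how the components are fitted jointly; a standard way is the backfitting algorithm, sketched below with a crude nearest-neighbour smoother standing in for any proper univariate smoother. The smoother and the data-generating function are illustrative assumptions, not part of the slides.

```python
import numpy as np

def backfit(X, Y, smooth, n_iter=20):
    """Backfitting sketch for f_N(x) = Ybar_N + sum_j f_Nj(x_j).

    `smooth(x, r)` is any univariate smoother returning fitted values at x.
    Each component is re-centred so that sum_i f_Nj(X_ij) = 0, as required.
    """
    N, J = X.shape
    mu = Y.mean()
    f = np.zeros((N, J))                 # f[:, j] = f_Nj evaluated at X[:, j]
    for _ in range(n_iter):
        for j in range(J):
            r = Y - mu - f.sum(axis=1) + f[:, j]   # partial residuals
            f[:, j] = smooth(X[:, j], r)
            f[:, j] -= f[:, j].mean()              # identifiability constraint
    return mu, f

def running_mean(x, r, k=30):
    """Crude k-nearest-neighbour mean smoother (illustrative only)."""
    order = np.argsort(x)
    out = np.empty_like(r)
    for pos, i in enumerate(order):
        lo, hi = max(0, pos - k // 2), min(len(x), pos + k // 2 + 1)
        out[i] = r[order[lo:hi]].mean()
    return out

rng = np.random.default_rng(1)
X = rng.uniform(size=(400, 2))
Y = 1.0 + np.sin(2 * np.pi * X[:, 0]) + X[:, 1]**2 + rng.normal(scale=0.1, size=400)
mu, f = backfit(X, Y, running_mean)
print(round(mu, 3))   # near EY = 1 + 0 + 1/3
```

Each pass smooths the partial residuals against one coordinate at a time, which is why only one-dimensional smoothing problems ever have to be solved.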


Splines vs. Kernels

Spline estimates:

1. Semiparametric approaches
2. High-dimensional data
3. Extra-large sample sizes
4. No asymptotic distribution
5. No uniform convergence over the whole interval
6. No measure of uniform accuracy (except the L2 norm)
7. So-called "sledge-hammer" technique

Kernel estimates:

1. Nonparametric techniques
2. Too costly for large dimensions
3. Too costly for large sample sizes N ∈ N
4. Asymptotic (normal) distribution (confidence intervals)
5. Uniform convergence over the whole interval
6. So-called "sharp-knife" technique


What is the "Curse of Dimensionality" problem?

1. The size N ∈ N of the data sample required to fit a J-dimensional
   regression surface increases exponentially with the number of dimensions.
2. Limitations on the estimation ability of most multivariate regression
   approaches (splines, kernels).
3. The asymptotic rate of convergence decreases with the number of
   dimensions J (according to the expression r = p / (2p + J)).
4. Algorithms dealing with high-dimensional data without a dimensionality
   reduction principle (straightforward methods) are too costly.
5. Special case - the components of X are not independent...
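The rate exponent r = p / (2p + J) from item 3 is easy to tabulate, which makes the deterioration with growing J concrete (p = 2, i.e. twice-differentiable f, is chosen here purely for illustration):

```python
# Optimal nonparametric rate exponent r = p / (2p + J) for smoothness p:
# the achievable rate N^{-r} deteriorates as the dimension J grows.
def rate(p, J):
    return p / (2 * p + J)

for J in (1, 2, 5, 10):
    print(J, rate(2, J))   # 0.4, 0.333..., 0.222..., 0.142...
```

For an additive model the univariate exponent r = p / (2p + 1) can be kept for every component, which is exactly the point made later in the Rate of Convergence section.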


Curse of Dimensionality - examples:

Consider random variables Y = (Y1, . . . , YJ) and the randomsample{Xi = (Xi1, . . . , XiJ); i = 1, . . . N} such that

Xi ∼ R([0, 1]J

), Y ∼ R

([0, 1]J

).


Maximum Distance vs. Euclidean Distance

Maximum distance: ‖x‖_max = max_{j=1,...,J} |xj|
Euclidean distance: ‖x‖²_euc = Σ_{j=1,...,J} xj²

Maximum distance

             J = 1      J = 2      J = 3      J = 5      J = 10     J = 20
N = 100      0.003838   0.054951   0.094651   0.232593   0.366015   0.571504
N = 1000     0.000506   0.015051   0.053464   0.129761   0.273968   0.440982
N = 10000    0.000044   0.004691   0.021613   0.044339   0.213186   0.402223
N = 100000   0.000006   0.001178   0.009108   0.030709   0.159703   0.353620

Euclidean distance

             J = 1      J = 2      J = 3      J = 5      J = 10     J = 20
N = 100      0.003838   0.060434   0.118966   0.328987   0.660090   1.264582
N = 1000     0.000506   0.017274   0.063800   0.191530   0.498007   1.000749
N = 10000    0.000041   0.005546   0.027003   0.060081   0.376753   0.909363
N = 100000   0.000005   0.001376   0.011672   0.052891   0.289131   0.795231

The empirical average minimum distance between two uniformly distributed
random variables in the hypercube [0, 1]^J.
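Numbers of the same order as those in the tables can be reproduced (up to Monte Carlo error) by drawing N uniform points plus a uniform target point and recording the minimum distance; the simulation design below is an assumption about how the tables were produced, sketched for small N only:

```python
import numpy as np

def avg_min_dist(N, J, reps=200, norm="max", seed=2):
    """Average minimum distance from Y ~ R([0,1]^J) to N uniform points."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(reps):
        X = rng.uniform(size=(N, J))
        y = rng.uniform(size=J)
        d = np.abs(X - y)                       # coordinate-wise distances
        per_point = (d.max(axis=1) if norm == "max"
                     else np.sqrt((d**2).sum(axis=1)))
        total += per_point.min()                # nearest of the N points
    return total / reps

print(avg_min_dist(100, 1))    # table reports 0.003838 for N = 100, J = 1
print(avg_min_dist(100, 10))   # table reports 0.366015 for N = 100, J = 10
```

The qualitative message survives any simulation noise: in J = 10 dimensions even the nearest of 100 sample points is, on average, more than a third of the cube's side away from the target.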


The Lower Bounds – bandwidth selection...

Lemma (Packing density in a hypercube – maximum distance)

Let Y ∼ R([0, 1]^J) and let Xi, i = 1, . . . , N, be a random sample where
each Xi ∼ R([0, 1]^J). Then Bmax(N, J) is the lower bound for the average
minimum distance under the maximum distance, where

Bmax(N, J) = (1/2) · (1 / N^{1/J}) · J / (J + 1)    (1)

Lemma (Packing density in a hypercube – Euclidean distance)

Let Y ∼ R([0, 1]^J) and let Xi, i = 1, . . . , N, be a random sample where
each Xi ∼ R([0, 1]^J). Then Beuc(N, J) is the lower bound for the average
minimum distance under the Euclidean distance, where

Beuc(N, J) = (1/2) · (√J / N^{1/J}) · J / (J + 1)    (2)


The additive form of the regression function

Is the true underlying regression function genuinely additive?

1. YES → straightforward estimation of the functional components
   (just an occasional case)
2. NO → one has to find some additive approximation, which is
   subsequently estimated

How does one define a measure of accuracy between the underlying
regression function and its approximation?


Additive Decomposition - approximation

Consider a regression function f which is not genuinely additive. In such
a case the regression function f can be successfully decomposed into main
effects (additive decomposition).

Condition 1

Let the distribution of X ∈ [0, 1]^J be absolutely continuous and let its
density g be bounded away from zero and infinity:
∃ b > 0 ∃ B > b such that b ≤ g(x) ≤ B for all x ∈ C = [0, 1]^J.

The additive approximation to f can be obtained as a sum of J univariate
functions f*_j(xj), where

f*_j(xj) = E[ f(X) | Xj = xj ] − E[ f(X) ],  x = (x1, . . . , xJ) ∈ [0, 1]^J.

If interactions between some variables are required, one can obtain them
in a similar way...
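The main effects f*_j(xj) = E[f(X) | Xj = xj] − E[f(X)] can be approximated by Monte Carlo. The sketch below assumes independent R([0, 1]) coordinates (a convenient special case of Condition 1, under which conditioning on Xj = xj just fixes that coordinate); the bilinear test function is an invented non-additive example.

```python
import numpy as np

def main_effect(f, j, xj, J, n=100_000, seed=3):
    """Monte Carlo sketch of f*_j(x_j) = E[f(X) | X_j = x_j] - E[f(X)],
    assuming independent R([0,1]) coordinates."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(size=(n, J))
    ef = f(X).mean()          # estimate of E f(X)
    Xc = X.copy()
    Xc[:, j] = xj             # condition on X_j = x_j (independence used here)
    return f(Xc).mean() - ef

# non-additive example: f(x) = x1 * x2, so f*_1(x1) = x1/2 - 1/4
f = lambda X: X[:, 0] * X[:, 1]
print(round(main_effect(f, 0, 1.0, J=2), 3))   # ≈ 1/2 - 1/4 = 0.25
```

Note that the main effects reproduce f exactly only when f is genuinely additive; for f(x) = x1·x2 the residual x1·x2 − x1/2 − x2/2 + 1/4 is the interaction term the slide alludes to.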


Additive Decomposition – definiteness...

Lemma 1

Let the random variable Σ_j hj(Xj) have a finite second moment, where the
hj are functions on [0, 1]. Set δ = √(1 − b/B) and let SD(·) denote the
standard deviation. Then each hj(Xj) has a finite second moment and the
following statement holds:

SD( Σ_j hj(Xj) ) ≥ ((1 − δ)/2)^{(J−1)/2} · ( SD(h1(X1)) + · · · + SD(hJ(XJ)) ).

Under Condition 1 it follows from the lemma that the functional components
are uniquely determined up to a set of measure zero.


Resumption...

1 Regression function f (x) = E[Y |X = x]

2 Estimates based on a random sample {(Xi , Yi), i = 1, . . . , N}3 Random variable Y has a mean µ ∈ R and a finete second

moment EY 2 < ∞4 With any loss of generality - we assume that function f has as

additive form - otherwise we use additive decomposition5 Functional components ff with zero mean Efj(xj) = 0

(to avoid constant functional components)


Regression splines

The first method - polynomial estimates (over-fitting, etc.)

Polynomial regression with penalties (not used anymore)

To avoid the problems related to polynomial regression ⇒
implementation of spline approaches (piecewise polynomials)

Definition 1 - Spline function

A spline is a piecewise polynomial function of degree n whose polynomial
pieces join together at the knot points, obeying continuity conditions for
the function itself and its first n − 1 derivatives.

Problem: How to choose the number and the positions of the knots?


Regression splines - power basis

Spline estimation approaches are based on a set of basis functions:

1. The spline power basis takes the following form:
   {1, x, x², . . . , xⁿ, (x − ξ1)ⁿ₊, . . . , (x − ξK)ⁿ₊}  (n - spline order)
2. The estimate of each functional component fj is defined as:
   f̂j(xj) = Σ_{l=0}^n β_{0l} xj^l + Σ_{k=1}^K β_{kn} (xj − ξk)ⁿ₊
3. The estimate of the underlying additive regression function is defined
   by the minimization problem
   Σ_{i=1}^N ( Yi − Ȳ_N − Σ_{j=1}^J f̂j(xij) )²
   over the basis coefficients β01, . . . , β0n, β1n, . . . , βKn.

[Figure: truncated power basis functions plotted over a grid from 0 to 100.]
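The power-basis construction in item 1 translates directly into a design matrix, after which item 3 is ordinary least squares. The sketch below handles a single covariate with cubic splines (n = 3) and equispaced knots; the test function and knot placement are illustrative assumptions.

```python
import numpy as np

def power_basis(x, knots, n=3):
    """Design matrix for the truncated power basis
    {1, x, ..., x^n, (x - xi_1)^n_+, ..., (x - xi_K)^n_+}."""
    cols = [x**l for l in range(n + 1)]                  # polynomial part
    cols += [np.clip(x - xi, 0, None)**n for xi in knots]  # truncated powers
    return np.column_stack(cols)

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(size=200))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=200)

B = power_basis(x, knots=np.linspace(0.2, 0.8, 4))
beta, *_ = np.linalg.lstsq(B, y, rcond=None)   # least-squares spline fit
fit = B @ beta
print(B.shape)   # (200, 8): 4 polynomial columns + 1 column per knot
```

The truncated term (x − ξ)³₊ is zero left of its knot and a cubic to the right, which is exactly what makes the fitted curve a cubic spline with continuous first and second derivatives.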


Regression splines with penalties

If we redefine the former minimization problem so that we also minimize over the knot positions ⇒ there is a problem of over-fitting (the estimate interpolates the data).

Regression Splines ⇒ Regression Splines with penalties

- to ensure a better flexibility of the final estimate
- to gain the ability to control the amount of smoothness

The estimate of the true underlying regression function f = (f_1, ..., f_J) is given by the minimization problem:

Minimize  Σ_{i=1}^N ( Y_i − Ȳ_N − Σ_{j=1}^J f_j(x_{ij}) )^2 + λ ∫_0^1 (f''(x))^2 dx,

where λ is the so-called smoothing parameter.


B-splines basis

Consider a B-spline basis of order n. Then it holds:

1. Each B-spline function consists of n + 1 polynomial pieces.
2. The single pieces join at n inner knots.
3. At the knot points, continuity holds up to order n − 1.
4. Each B-spline basis function is positive over a domain spanned by n + 2 knots - everywhere else it is zero by definition.
5. Each B-spline function is overlapped by 2n other basis functions.
6. At any point x ∈ [0, 1] there are n + 1 nonzero basis functions.
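These properties can be checked numerically. The sketch below (assuming a clamped cubic knot vector on [0, 1]; the knot grid is an illustrative choice) counts how many basis functions are nonzero at an interior point:

```python
import numpy as np
from scipy.interpolate import BSpline

k = 3                                     # cubic B-splines (spline order n = 3)
t = np.concatenate(([0.0] * k, np.linspace(0, 1, 11), [1.0] * k))  # clamped knots
nbasis = len(t) - k - 1                   # number of basis functions

x0 = 0.37                                 # an arbitrary interior point
# each basis element lives on k + 2 consecutive knots; nan outside its support
vals = np.nan_to_num([BSpline.basis_element(t[i:i + k + 2], extrapolate=False)(x0)
                      for i in range(nbasis)])

# property 6: exactly n + 1 basis functions are nonzero at any interior point,
# and the clamped B-spline basis forms a partition of unity there
nonzero = int(np.sum(vals > 0))
```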



B-splines - estimation

The estimate of each functional component is written as a linear combination of spline basis functions (piecewise polynomials of degree n ∈ N):

f_{jΔ}(x_j) = Σ_{k=1}^{K+n+1} ϑ_{jk} · B_{kn}(x_j)

The estimate of the whole unknown regression function f is defined by the following minimization problem:

min_{ϑ_{jk} ∈ R}  Σ_{i=1}^N [ Y_i − Ȳ_N − Σ_{j=1}^J Σ_{k=1}^{K+n+1} ϑ_{jk} · B_{kn}(x_{ij}) ]^2
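A least-squares B-spline fit of exactly this kind is available in `scipy`; the sketch below (simulated data, hypothetical knot choice) recovers the K + n + 1 coefficients mentioned above:

```python
import numpy as np
from scipy.interpolate import LSQUnivariateSpline

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 300))
y = np.exp(-3 * x) + 0.05 * rng.standard_normal(300)

interior_knots = np.linspace(0.1, 0.9, 9)              # the K = 9 inner knots xi_k
spl = LSQUnivariateSpline(x, y, interior_knots, k=3)   # cubic least-squares B-spline
mse = float(np.mean((spl(x) - y)**2))
```

The fitted object carries K + n + 1 = 13 coefficients, matching the count of basis functions in the expansion.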


B-splines with penalties (P-splines)

To ensure better control over the smoothness and a better flexibility of the final estimate, B-splines with penalties were proposed:

The estimate is given by the minimization problem

Minimize:  Σ_{i=1}^N ( f_N(X_i) − Y_i )^2 + Σ_{j=1}^J λ_j · ∫_{ξ_0}^{ξ_{K+1}} (f''_{jΔ}(x_j))^2 dx_j

with respect to the basis coefficients ϑ_{jk} and the parameters λ_j, with the same B-spline basis as in the case of simple B-spline estimates.

The optimal choice of the smoothing parameter λ ⇒ Model Selection Criteria
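In practice the integral penalty is often approximated by a squared difference penalty on adjacent B-spline coefficients (the P-spline idea of Eilers and Marx). A minimal sketch with simulated data and an arbitrarily fixed λ (all concrete values here are illustrative assumptions):

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_design(x, t, k):
    """Design matrix of B-spline basis functions evaluated at the points x."""
    n = len(t) - k - 1
    cols = [np.nan_to_num(BSpline.basis_element(t[i:i + k + 2], extrapolate=False)(x))
            for i in range(n)]
    return np.column_stack(cols)

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 250))
y = np.cos(3 * x) + 0.1 * rng.standard_normal(250)

k = 3
t = np.concatenate(([0.0] * k, np.linspace(0, 1, 21), [1.0] * k))
B = bspline_design(x, t, k)
D = np.diff(np.eye(B.shape[1]), n=2, axis=0)   # 2nd-order difference penalty matrix
lam = 1.0                                      # smoothing parameter, fixed ad hoc
# penalized normal equations: (B'B + lam * D'D) theta = B'y
theta = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
fit = B @ theta
```

Increasing `lam` shrinks the differences of neighbouring coefficients and hence the roughness of the fit, which is what the integral of (f'')^2 controls in the display above.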


Power basis vs. B-spline basis

Power basis:

1. Direct relation between a knot and the corresponding basis function
2. Greater correlation between basis functions

B-spline basis:

1. Numerically much more stable set of basis functions
2. Smaller correlation between basis functions



Additive Kernel Estimates - progress

A multidimensional estimate of the unknown regression function f ⇒ subsequently we estimate the single components f_1, ..., f_J.

1. Motivated by additive linear regression
2. First iterative procedures (backfitting algorithm)
3. Other iterative procedures (RPR, PPR, MARS)
4. The so-called Direct Integration Method, proposed in 1994

↪ the statistical properties of such an estimate are straightforward to derive (bias, variance, asymptotic properties, confidence intervals, etc.)
↪ the asymptotic normality of DIM estimates


Direct Integration Method

Consider a multivariate unknown regression function f(x) of additive form. Let X = (X_1, X̄) ∈ R × R^{J−1} and define the functional ϕ_1(x_1) as follows:

ϕ_1(x_1) = ∫ f(x_1, x̄) p_2(x̄) dx̄

Under the assumption of the additive form of the function f = (f_1, ..., f_J), it holds that ϕ_1 = f_1 up to the additive constant µ.

- Multivariate Nadaraya-Watson kernel estimate
- Kernel estimates of the functions f(·) and p_2


Direct Integration Method

The estimate of f_1(x_1) is given as a sample version of the functional ϕ_1(x_1):

f̂_1(x_1) = (1/N) Σ_{i=1}^N f̂(x_1, X̄_i)

The estimate f̂_1(x_1) can be written in the form:

f̂_1(x_1) = Σ_{i=1}^N w̃_i(x_1) Y_i,

where w̃_i(x_1) = N^{−1} Σ_{l=1}^N w_i(x_1, X̄_l). The weights w_i(x_1, X̄) are given by the equation f̂(x_1, X̄) = Σ_{i=1}^N w_i(x_1, X̄) Y_i.
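The averaging step above can be illustrated with a toy two-dimensional sketch (simulated data, Gaussian product kernel, ad-hoc bandwidth - all assumptions, not the method's prescribed tuning):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 400
X = rng.uniform(0, 1, (N, 2))
Y = np.sin(2 * np.pi * X[:, 0]) + X[:, 1]**2 + 0.1 * rng.standard_normal(N)

def nw(x1, x2, h=0.1):
    """Bivariate Nadaraya-Watson estimate of f at (x1, x2), Gaussian kernel."""
    w = np.exp(-((X[:, 0] - x1)**2 + (X[:, 1] - x2)**2) / (2 * h**2))
    return np.sum(w * Y) / np.sum(w)

def f1_hat(x1):
    """Direct-integration estimate: average the surface over the X2 sample."""
    return np.mean([nw(x1, x2i) for x2i in X[:, 1]])

# up to an additive constant, f1_hat should track sin(2*pi*x1):
# high near x1 = 0.25 (sin = 1), low near x1 = 0.75 (sin = -1)
vals = np.array([f1_hat(0.25), f1_hat(0.75)])
```

The constant absorbed by the averaging over X̄ is exactly the additive constant µ mentioned in the functional definition.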


Asymptotic normality of the Kernel Additive Estimate

The functional components f_2, ..., f_J can be obtained by a similar process, considering the functional ϕ_k(x_k) and the partition (X_k, X̄) ∈ R × R^{J−1}, with X̄ = (X_1, ..., X_{k−1}, X_{k+1}, ..., X_J).

Theorem (Asymptotic normality)

Under some assumptions on N ∈ N and on the bandwidths h and g of the kernel estimates, it holds that

N^{2/5} [ ϕ̂_j(x_j) − ϕ_j(x_j) ] → N( b_j(x_j), v_j(x_j) )


Generalization into the GAM

In the case of binary data (or survival time data) it is more convenient to use Generalized Additive Models - GAM.

1. Full model specification - the conditional distribution of Y given X belongs to an exponential family, with a known link function:

G[ f(x) ] = µ + Σ_{j=1}^J f_j(x_j)

2. Partial model specification - no restriction to an exponential family - the variance function stays unrestricted

If one takes the link function G to be the identity ⇒ the classical Additive Regression model (other choices: logit, probit, logarithm).


GAM - estimation

The estimation procedure is similar to that of the Additive Kernel estimates. Let X = (X_1, X̄) with X̄ = (X_2, ..., X_J), and define ϕ_1(x_1):

ϕ_1(x_1) = ∫ G[f(x_1, x̄)] · p_2(x̄) dx̄

Multidimensional Nadaraya-Watson kernel estimator ⇒ nonparametric multivariate kernel estimates of p_2 and f.

- the estimate of f_1 is identified with the estimate of ϕ_1
- the estimate of ϕ_1 is given by the equation:

ϕ̂_1(x_1) = (1/N) Σ_{i=1}^N G[ f̂(x_1, X̄_i) ],   X̄_i = (X_{i2}, ..., X_{iJ}).


ADVANTAGES or DISADVANTAGES?

What is the main advantage of an additive approach?


The Optimal Global Rate of Convergence

The sequence {b_N} is the optimal rate of convergence if:

lim_{c→0} liminf_{N→∞} sup_{f∈κ} P[ ‖T_N − f‖_q > c · b_N ] = 1

lim_{c→∞} limsup_{N→∞} sup_{f∈κ} P[ ‖T_N − f‖_q > c · b_N ] = 0

The optimal global rate of convergence given by Stone:

Theorem (Rate of Convergence for Nonparametric Estimates)

Let β ∈ (0, 1] and set p = k + β. Let 0 < q ≤ ∞ and set r = (p − m)/(2p + J). Then the optimal global rate of convergence is

{N^{−r}} for q ∈ (0, ∞),   {(N^{−1} · ln N)^r} for q = ∞.


Additive Reduction Principle

- The effectiveness of the additive reduction principle for the simplicity and interpretability of the model.
- "Curse of dimensionality" prevention.
- The improvement of the optimal global rate of convergence:
  r = (p − m)/(2p + J) −→ r = (p − m)/(2p + 1)
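The gain from the additive reduction is easy to quantify; a small numerical illustration (the values p = 2, m = 0, N = 10 000 and J = 5 are hypothetical):

```python
# exponent of the optimal rate: r = (p - m)/(2p + J) for the full J-dimensional
# problem versus r = (p - m)/(2p + 1) after the additive reduction
p, m = 2.0, 0.0            # smoothness p, estimating the function itself (m = 0)

def rate(N, J):
    """Optimal global rate N^(-r) for a J-dimensional regression problem."""
    r = (p - m) / (2 * p + J)
    return N ** (-r)

full = rate(10_000, J=5)       # J = 5 covariates, no additive structure
additive = rate(10_000, J=1)   # additive model: the univariate rate
```

The additive rate N^{−2/5} is strictly faster than the full-dimensional rate N^{−2/9}, independently of J.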

[Figure: a two-dimensional regression surface over predictors X and Z with response Y, and the estimated smooth components s(income, 3.12) and s(education, 3.18) plotted against income and education.]


Additive Expansion in L2 Norm

Consider an additive estimate f̂_N of the regression function f. Set γ = 1/(2p + 1) and r = (p − m)/(2p + 1).

Theorem (Rate of Convergence for Additive Estimates)

Suppose that all necessary conditions hold. Let N_N ∼ N^γ. Then:

‖ f̂_Nj^{(m)} − (f*_j)^{(m)} ‖_j^2 = O_pr(N^{−2r}),   ‖ f̂_Nj − f*_j ‖_j^2 = O_pr(N^{−2r}),
‖ f̂_N − f* ‖^2 = O_pr(N^{−2r}),   (Ȳ_N − µ)^2 = O_pr(N^{−2r}).

- The only reasonable derivatives are partial derivatives with respect to the same variable (∂²f̂_N / ∂x_{j1}∂x_{j2} = 0 for j1 ≠ j2).
- The theorem holds for the redefinition of the mth derivative of the additive function (a linear combination of partial derivatives).



Additive Expansion in L∞ Norm

- The effect of the additive decomposition on the rate of convergence in the supremum norm: r = p/(2p + J) −→ r = p/(2p + 1).
- Decompose not only the unknown regression function but the whole regression problem (⇒ J univariate regression problems).

Theorem

Let all necessary conditions hold and let N^γ ∼ N_N. Suppose that EY = µ = 0 and let r = p/(2p + 1). Then:

‖f̂_N − f*‖_∞ = sup_{x∈[0,1]^J} |f̂_N(x) − f*(x)| = O_pr(N^{−r} · log^r N)   (3)

‖f̂_Nj − f*_j‖_{∞,j} = sup_{x_j∈[0,1]} |f̂_Nj(x_j) − f*_j(x_j)| = O_pr(N^{−r} · log^r N)   (4)


The Effectiveness of the Additive Expansion

Figure: The optimal global rate of convergence for the additive models in the case of a two-dimensional regression surface (J = 2 vs. J = 1), for the supremum norm and the Euclidean norm, as a function of the number of observations N.


Optimal model selection

1. Spline estimates: with a smoothing parameter λ one obtains a whole set of "good" admissible models ⇒ a single model has to be selected.
2. Penalized splines: the set of admissible models grows even larger once we also minimize over the smoothing parameter λ and the knot positions ∆.
3. Kernel regression: the problem of the right choice of the smoothing parameter h - a measure of localness (or a multivariate bandwidth parameter h).


Model Selection Criteria

[Figure: penalized-spline fits to a simulated data set for an increasing sequence of smoothing parameters λ = 0.000008, 0.001357, 0.037821, 0.199624, and 1.053625.]

Criteria for selecting the smoothing parameter:
- Cross-Validation
- Generalized Cross-Validation
- Akaike Information Criterion
- Bayesian Information Criterion
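One common automatic choice among these criteria is Generalized Cross-Validation, GCV(λ) = N · RSS(λ) / (N − tr H_λ)². A sketch (simulated data, truncated-power basis, and a hypothetical λ grid - all illustrative assumptions) that scores each candidate λ:

```python
import numpy as np

def gcv_score(B, y, D, lam):
    """Generalized cross-validation score of a penalized least-squares fit."""
    N = len(y)
    A = B.T @ B + lam * D.T @ D
    H = B @ np.linalg.solve(A, B.T)      # hat (smoother) matrix for this lambda
    resid = y - H @ y
    edf = np.trace(H)                    # effective degrees of freedom
    return N * float(np.sum(resid**2)) / (N - edf)**2

rng = np.random.default_rng(4)
x = np.linspace(0, 1, 120)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(120)

# cubic truncated-power basis with 9 interior knots, 2nd-difference penalty
B = np.column_stack([x**l for l in range(4)] +
                    [np.clip(x - xi, 0, None)**3 for xi in np.linspace(0.1, 0.9, 9)])
D = np.diff(np.eye(B.shape[1]), n=2, axis=0)

lams = 10.0 ** np.arange(-8, 2)
scores = [gcv_score(B, y, D, lg) for lg in lams]
best = lams[int(np.argmin(scores))]
```

The same loop works for CV, AIC, or BIC by swapping the score function.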


Iterative methods - Backfitting Algorithm

The first proposals → iterative methods (based on the additive decomposition):

f_j(x_j) = E[ Y − µ − Σ_{t=1, t≠j}^J f_t(x_t) | X_j ]

1. Initialization: µ̂⁰ = (1/N) Σ_{i=1}^N Y_i,  f̂_j = f̂_j⁰, j = 1, ..., J
2. For each j:
   f̂_j = S_j[ Y − µ̂⁰ − Σ_{k≠j} f̂_k(X_k) | X_j ]
   µ̂⁰ = µ̂⁰ + (1/N) Σ_{i=1}^N f̂_j(X_{ij})
   f̂_j = f̂_j − (1/N) Σ_{i=1}^N f̂_j(X_{ij})
3. Repeat step 2 until sufficient convergence.
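The steps above can be sketched with a simple Nadaraya-Watson smoother playing the role of S_j (a simulated two-covariate example; the smoother, bandwidth, and iteration count are illustrative assumptions, and the mean update is absorbed by the centering step):

```python
import numpy as np

def smooth(x, r, h=0.1):
    """Nadaraya-Watson smoother of the partial residuals r against covariate x."""
    d = x[:, None] - x[None, :]
    W = np.exp(-d**2 / (2 * h**2))
    return (W @ r) / W.sum(axis=1)

def backfit(X, Y, iters=20):
    N, J = X.shape
    mu = Y.mean()                 # step 1: initialize with the sample mean
    f = np.zeros((N, J))          # component estimates at the sample points
    for _ in range(iters):        # step 3: iterate until (approximate) convergence
        for j in range(J):
            # step 2: smooth the partial residuals with all other components removed
            partial = Y - mu - f.sum(axis=1) + f[:, j]
            f[:, j] = smooth(X[:, j], partial)
            f[:, j] -= f[:, j].mean()   # center each component for identifiability
    return mu, f

rng = np.random.default_rng(5)
N = 300
X = rng.uniform(0, 1, (N, 2))
Y = 1.0 + np.sin(2 * np.pi * X[:, 0]) + (X[:, 1] - 0.5)**2 + 0.1 * rng.standard_normal(N)

mu, f = backfit(X, Y)
fit = mu + f.sum(axis=1)
```

Any univariate smoother (spline, local polynomial, kernel) can be substituted for `smooth` without changing the outer loop.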


Iterative techniques - Computer-intensive methods

Recursive Partitioning Regression (RPR)
- spline estimate of degree zero
- locally constant estimate with high interpretability

MARS Algorithm
- multivariate adaptive regression spline estimates
- modification of the RPR algorithm (continuity condition)

Projection Pursuit Regression (PPR)
- projection into lower dimensions
- additivity in a different sense
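RPR's locally constant (degree-zero spline) estimate is built from greedy splits of the covariate space. A minimal sketch of a single greedy split, assuming a one-dimensional covariate (illustration only, helper names are hypothetical):

```python
import numpy as np

def best_split(x, y):
    """Greedy single split for a locally constant (degree-zero spline) fit:
    choose the threshold on x that minimises the residual sum of squares
    of the two resulting constant pieces."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_rss, best_t = np.inf, None
    for i in range(1, len(xs)):
        left, right = ys[:i], ys[i:]
        rss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if rss < best_rss:
            best_rss, best_t = rss, (xs[i - 1] + xs[i]) / 2
    return best_rss, best_t

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 100)
y = np.where(x < 0.5, 0.0, 2.0) + 0.1 * rng.standard_normal(100)
rss, t = best_split(x, y)
```

Full RPR applies this split recursively within each region, which is what makes the resulting partition so interpretable.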




Example: Polynomial Regression

Polynomial Regression Estimate
- spline estimate of the 3rd degree
- Life exp. ~ S12[Log(People/TV), Log(People/physician)]

[Figure: fitted polynomial regression surface over Log(people per TV) and Log(people per physician); vertical axis: Average Life Exp.]

Residual Sum of Squares: 10.56021


Example: Additive Regression Model

Additive Regression Estimate
- additive spline estimate of the 3rd degree
- a special case of PPR (projections along the coordinate axes)
- Life expectancy ~ S1[Log(People/TV)] + S2[Log(People/phys)]

[Figure: fitted additive regression surface over Log(people per TV) and Log(people per physician); vertical axis: Average Life Exp.]

Residual Sum of Squares: 11.89261


Example: Recursive Partitioning Regression

Recursive Partitioning Regression Estimate
- locally constant estimate - spline of degree 0
- Life expectancy ~ Σ_{v=1}^V π_v · 1{x ∈ B_v}

[Figure: fitted piecewise-constant surface over Log(people per TV) and Log(people per physician); vertical axis: Average Life Exp.]

Residual Sum of Squares: NaN


Example: MARS Algorithm

Multivariate Adaptive Regression Splines (MARS)
- modification of the RPR algorithm
- Life expectancy ~ s_0 + Σ_{v=1}^V s_v B_v(x)

[Figure: fitted MARS surface over Log(people per TV) and Log(people per physician); vertical axis: Average Life Exp.]

Residual Sum of Squares: 11.32777
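The basis functions B_v(x) in MARS are built from products of truncated linear (hinge) functions, which is the continuity condition that distinguishes it from RPR's piecewise-constant fit. A minimal one-knot sketch (hypothetical helper names; the real algorithm searches over knots and variables adaptively):

```python
import numpy as np

def hinge(x, knot, sign=+1):
    """Truncated linear basis (x - knot)_+ or (knot - x)_+ used by MARS."""
    return np.maximum(sign * (x - knot), 0.0)

def mars_one_knot(x, y, knot):
    """Least-squares fit of y ~ s0 + s1*(x-knot)_+ + s2*(knot-x)_+,
    i.e. a continuous piecewise-linear model with one knot."""
    B = np.column_stack([np.ones_like(x), hinge(x, knot, +1), hinge(x, knot, -1)])
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return coef, B @ coef

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 100))
y = np.abs(x - 0.5) + 0.05 * rng.standard_normal(100)
coef, fitted = mars_one_knot(x, y, knot=0.5)
```

Unlike an RPR split at the same point, the fitted function is continuous at the knot.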


Example: Projection Pursuit Regression

Projection Pursuit Regression (PPR)
- projection into lower dimensions
- Life expectancy ~ Σ_{v=1}^V g_v(b_v^T x)

[Figure: fitted PPR surface over Log(people per TV) and Log(people per physician); vertical axis: Average Life Exp.]

Residual Sum of Squares: 6.129001
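A single PPR term g_v(b_v^T x) can be illustrated by a crude search over random unit directions, fitting a cubic ridge function along each projection and keeping the best pair. This is a sketch only; real PPR implementations optimise the direction b_v numerically and estimate g_v by smoothing.

```python
import numpy as np

def fit_ppr_term(X, y, n_dirs=200, degree=3, seed=0):
    """One PPR term: try random unit directions b, fit a cubic ridge
    function g along each projection b^T x, keep the best (rss, b, g)."""
    rng = np.random.default_rng(seed)
    best = (np.inf, None, None)
    for _ in range(n_dirs):
        b = rng.standard_normal(X.shape[1])
        b /= np.linalg.norm(b)                     # unit direction
        z = X @ b                                  # one-dimensional projection
        g = np.polynomial.Polynomial.fit(z, y, deg=degree)
        rss = ((y - g(z)) ** 2).sum()
        if rss < best[0]:
            best = (rss, b, g)
    return best

rng = np.random.default_rng(3)
X = rng.standard_normal((300, 2))
true_b = np.array([0.6, 0.8])                      # unit vector by construction
y = np.sin(X @ true_b) + 0.1 * rng.standard_normal(300)
rss, b, g = fit_ppr_term(X, y)
```

This illustrates the "additivity in a different sense": the model is additive in the projected variables b_v^T x rather than in the original coordinates.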


Additive Regression Models with Regression Splines

Thank you for your attention...

Matúš Maciak: [email protected]
