Overview of Non-Parametric
Probability Density Estimation Methods
Sherry Towers
State University of New York at Stony Brook
S.Towers
All kernel PDF estimation methods (PDEs) are developed from a simple idea…
If a data point lies in a region where the signal MC events cluster tightly and the background MC events cluster loosely, the point is likely to be signal
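The idea above can be sketched in a few lines. This is only an illustration, not the method described in these slides: it assumes that "tightness of clustering" is measured by the distance to the k-th nearest MC point, and the names `knn_density`, `signal_likeness`, and the parameter `k` are hypothetical.

```python
import numpy as np

def knn_density(x, mc_points, k=10):
    """Crude local density: k points divided by the volume reached at the k-th nearest MC point."""
    mc = np.asarray(mc_points, dtype=float)        # shape (n_points, n_dims)
    n, d = mc.shape
    dist = np.sort(np.linalg.norm(mc - np.asarray(x, dtype=float), axis=1))
    r_k = dist[k - 1]                              # distance to the k-th nearest MC point
    # The unit-ball volume constant is omitted; it cancels in the ratio below.
    return k / (n * r_k ** d)

def signal_likeness(x, signal_mc, bkgnd_mc, k=10):
    """Value near 1 where signal MC is dense relative to background MC, near 0 otherwise."""
    rho_s = knn_density(x, signal_mc, k)
    rho_b = knn_density(x, bkgnd_mc, k)
    return rho_s / (rho_s + rho_b)
```

A point in the tight signal cluster gets a value close to 1, and a point far out in the loose background cloud gets a value close to 0.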
To estimate a PDF, PDEs use the idea that any continuous function can be modelled by a sum of some “kernel” function
Gaussian kernels are a good choice for particle physics
So, a PDF can be estimated by a sum of multi-dimensional Gaussians centred about the MC-generated points
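A minimal sketch of such a sum of Gaussians, using one covariance matrix taken from the whole MC sample. The smoothing factor `h` and its default (a Scott-style rule) are assumptions added for the illustration and do not come from the slides; the function name `static_kernel_pde` is hypothetical.

```python
import numpy as np

def static_kernel_pde(x, mc_points, h=None):
    """Estimate the PDF at each row of x as the average of Gaussians centred on the MC points."""
    mc = np.asarray(mc_points, dtype=float)        # shape (n_points, n_dims)
    x = np.atleast_2d(np.asarray(x, dtype=float))  # shape (n_eval, n_dims)
    n, d = mc.shape
    if h is None:
        h = n ** (-1.0 / (d + 4))                  # assumed smoothing factor, not from the slides
    # One kernel covariance for all points, taken from the entire sample and scaled by h^2.
    cov = np.atleast_2d(np.cov(mc, rowvar=False)) * h ** 2
    cov_inv = np.linalg.inv(cov)
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    diff = x[:, None, :] - mc[None, :, :]          # shape (n_eval, n_points, n_dims)
    maha = np.einsum("mni,ij,mnj->mn", diff, cov_inv, diff)
    return norm * np.exp(-0.5 * maha).mean(axis=1)
```

Calling `static_kernel_pde(points, signal_mc)` and `static_kernel_pde(points, bkgnd_mc)` on the same evaluation points gives the two density estimates that a signal/background comparison needs.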
The best form of the Gaussian kernel is a matter of debate:
The static-kernel PDE method uses a kernel whose covariance matrix is obtained from the entire sample
The Gaussian Expansion Method (GEM) uses an adaptive kernel; the covariance matrix used for the Gaussian at each MC point is the “local” covariance matrix around that point
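The adaptive-kernel idea can be sketched as follows. The slides do not say how the "local" covariance is defined, so this sketch assumes it is the sample covariance of the `k` nearest MC neighbours of each point. That choice, the parameter `k`, and the function names are hypothetical, and the code should not be read as the GEM algorithm itself.

```python
import numpy as np

def local_covariances(mc, k=50):
    """Covariance of the k nearest neighbours of each MC point (assumed meaning of 'local')."""
    n, d = mc.shape
    covs = np.empty((n, d, d))
    for i in range(n):
        dist = np.linalg.norm(mc - mc[i], axis=1)
        nearest = np.argsort(dist)[:k]             # includes the point itself
        covs[i] = np.atleast_2d(np.cov(mc[nearest], rowvar=False))
    return covs

def adaptive_kernel_pde(x, mc_points, k=50):
    """Estimate the PDF at a single point x with one Gaussian per MC point, each with its own covariance."""
    mc = np.asarray(mc_points, dtype=float)        # shape (n_points, n_dims)
    x = np.asarray(x, dtype=float)                 # shape (n_dims,)
    n, d = mc.shape
    covs = local_covariances(mc, k)
    cov_inv = np.linalg.inv(covs)                  # batched inverse, shape (n, d, d)
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(covs))
    diff = x - mc                                  # shape (n_points, n_dims)
    maha = np.einsum("ni,nij,nj->n", diff, cov_inv, diff)
    return float(np.mean(norm * np.exp(-0.5 * maha)))
```

The per-point neighbour search in `local_covariances` is the extra cost of an adaptive kernel: a static kernel needs only one covariance matrix for the whole sample.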
GEM vs Static-Kernel PDE
GEM gives an unbiased estimate of the PDF, but is slower to use because the local covariance must be calculated for each MC point
Static-kernel PDE methods have smaller variance and are faster to use, but yield biased estimates of the PDF
Comparison of GEM and static-kernel PDE:
PDE vs Neural Networks
Both PDEs and neural networks can take into account non-linear correlations in parameter space
Both methods are, in principle, equally powerful
For the most part they perform similarly in an “average” analysis
But PDEs have far fewer parameters, and the algorithm is more intuitive (easier to understand)
Plus, the PDE estimate of the PDF can be examined visually:
PDEs vs Neural Nets…
Some problems are particularly well suited to PDEs
Summary
PDE methods are as powerful as neural networks, and offer an interesting alternative
They have very few parameters, are easy to use and easy to understand, and yield an unbinned estimate of the PDF that the user can examine in the multidimensional parameter space!