A nonparametric method of approximating an unknown function from data by assuming that it is close to the empirical distribution of the data convolved with some kernel.
This is especially popular when the target is a probability density function; then you are working with a kernel density estimator.
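Concretely, given observations \(x_1, \dots, x_N\), a kernel \(K\), and a bandwidth \(h > 0\), the kernel density estimate is
\[
\hat{f}_h(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\!\left(\frac{x - x_i}{h}\right).
\]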
To learn about:
Bandwidth/kernel selection in density estimation
Bernacchia and Pigolotti (2011) have a neat hack: “self-consistency” for simultaneous kernel and distribution inference, i.e. simultaneous deconvolution and bandwidth selection. The idea is to remove bias using simple spectral methods, estimating the kernel which, in a certain sense, would have generated the very data you just observed. The resulting kernels look similar to finite-sample corrections for Gaussian scale parameter estimates, but are not quite Gaussian.
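A minimal sketch of that spectral fixed point as I understand it; the grid, the crude frequency-acceptance rule, and the normalisation below are my own simplifications, not theirs (fastKDE, O’Brien et al. 2016, is a serious implementation):

```python
import numpy as np

def self_consistent_kde(x, grid):
    """Bernacchia-Pigolotti-style self-consistent density estimate (sketch).

    In Fourier space the estimate is Delta(t) * phi(t), where Delta is the
    empirical characteristic function and phi solves the self-consistency
    equation in closed form; frequencies where the discriminant goes
    negative are simply discarded here.
    """
    n = len(x)
    dx = grid[1] - grid[0]
    t = 2 * np.pi * np.fft.fftfreq(len(grid), d=dx)  # angular frequencies
    # Empirical characteristic function Delta(t) = mean_j exp(i t x_j)
    ecf = np.exp(1j * t[:, None] * x[None, :]).mean(axis=1)
    # Self-consistent kernel transform:
    # phi(t) = n / (2(n-1)) * [1 + sqrt(1 - 4(n-1) / (n^2 |Delta(t)|^2))]
    disc = 1 - 4 * (n - 1) / (n**2 * np.abs(ecf) ** 2 + 1e-300)
    phi = np.zeros_like(disc)
    ok = disc >= 0                      # crude acceptance set
    phi[ok] = n / (2 * (n - 1)) * (1 + np.sqrt(disc[ok]))
    fhat_ft = ecf * phi
    # Back to real space via an inverse-Fourier Riemann sum on the grid
    dt = t[1] - t[0]
    f = (fhat_ft[None, :] * np.exp(-1j * t[None, :] * grid[:, None])).sum(axis=1).real
    return np.clip(f * dt / (2 * np.pi), 0.0, None)

# e.g. f = self_consistent_kde(np.random.randn(500), np.linspace(-4, 4, 256))
```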
Question: could it work with mixture models too?
Mixture models
Where the number of kernels does not grow as fast as the number of data points, this becomes a mixture model; or, if you’d like, kernel density estimates are a limiting case of mixture model estimates.
They are so clearly similar that I think it best we not make them both feel awkward by dithering about where the free parameters are. Anyway, they are filed separately. (Battey and Liu 2013; van de Geer 1996; Zeevi and Meir 1997) discuss some useful things common to various convex combination estimators.
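To make that continuum concrete, here is a throwaway comparison (stock scipy/sklearn APIs, nothing specific to the papers above): a \(k\)-component Gaussian mixture fit by EM slides toward a Gaussian KDE as \(k\) approaches the number of data points.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)
x = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(1, 1.0, 300)])
grid = np.linspace(-5, 5, 400)

# Kernel density estimate: one Gaussian per data point, common bandwidth
kde = gaussian_kde(x)            # Scott's rule bandwidth by default
f_kde = kde(grid)

# Mixture models: few components; locations, weights, and scales all fitted
for k in (2, 5, 20):
    gmm = GaussianMixture(n_components=k, random_state=0).fit(x[:, None])
    f_gmm = np.exp(gmm.score_samples(grid[:, None]))
    # As k grows toward len(x), f_gmm wiggles its way toward f_kde
```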
Does this work with uncertain point locations?
The fact that we can write the kernel density estimate as the convolution of the kernel with a sum of Dirac deltas immediately suggests that we could swap the deltas for something else, such as Gaussians encoding uncertainty about each observation’s location. Can we recover well-behaved estimates in that case? This would be a kind of hierarchical model, possibly a typical Bayesian one.
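Sketching it: the vanilla estimate convolves the kernel with the empirical measure,
\[
\hat{f}(x) = \Bigl(K_h * \tfrac{1}{N}\textstyle\sum_i \delta_{x_i}\Bigr)(x)
           = \frac{1}{N}\sum_{i=1}^{N} K_h(x - x_i),
\]
and replacing each \(\delta_{x_i}\) with a location-uncertainty density, say \(\mathcal{N}(x_i, \sigma_i^2)\), gives
\[
\tilde{f}(x) = \frac{1}{N}\sum_{i=1}^{N} \bigl(K_h * \mathcal{N}(x_i, \sigma_i^2)\bigr)(x),
\]
which for a Gaussian \(K_h\) is again a Gaussian mixture, now with inflated component variances \(h^2 + \sigma_i^2\).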
Does this work with asymmetric kernels?
Almost all the kernel estimates I’ve seen require the kernel to be symmetric, because of Cline’s argument that asymmetric kernels are inadmissible in the class of all (possibly multivariate) densities. Presumably this concerns \(\mathcal{C}^1\) densities, i.e. once continuously differentiable ones. In particular, admissible kernels are those which have “nonnegative Fourier transforms bounded by 1,” which implies symmetry about the origin. If we have an a priori constrained class of densities, this might not apply.
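A quick numerical look at that Fourier criterion (my own toy check, not Cline’s construction): the Gaussian kernel’s transform \(e^{-t^2/2}\) stays in \([0,1]\), while the Epanechnikov kernel’s transform \(3(\sin t - t\cos t)/t^3\) dips below zero, so it fails the quoted test.

```python
import numpy as np

t = np.linspace(1e-6, 20, 2000)

# Gaussian kernel: Fourier transform exp(-t^2/2), nonnegative, bounded by 1
ft_gauss = np.exp(-t**2 / 2)

# Epanechnikov kernel K(x) = 3/4 (1 - x^2) on [-1, 1]:
# Fourier transform 3 (sin t - t cos t) / t^3, which oscillates below zero
ft_epan = 3 * (np.sin(t) - t * np.cos(t)) / t**3

def admissible(ft):
    """The criterion as quoted above: 0 <= FT <= 1 everywhere on the grid."""
    return bool(np.all(ft >= 0) and np.all(ft <= 1))

print(admissible(ft_gauss))  # True
print(admissible(ft_epan))   # False: goes negative near t ~ 5
```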
References
Baddeley, and Turner. 2006.
“Modelling Spatial Point Patterns in R.” In
Case Studies in Spatial Point Process Modeling. Lecture Notes in Statistics 185.
Baddeley, Turner, Møller, et al. 2005.
“Residual Analysis for Spatial Point Processes (with Discussion).” Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Bashtannyk, and Hyndman. 2001.
“Bandwidth Selection for Kernel Conditional Density Estimation.” Computational Statistics & Data Analysis.
Battey, and Liu. 2013.
“Smooth Projected Density Estimation.” arXiv:1308.3968 [Stat].
Berman, and Diggle. 1989.
“Estimating Weighted Integrals of the Second-Order Intensity of a Spatial Point Process.” Journal of the Royal Statistical Society. Series B (Methodological).
Bernacchia, and Pigolotti. 2011.
“Self-Consistent Method for Density Estimation.” Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Botev, Grotowski, and Kroese. 2010.
“Kernel Density Estimation via Diffusion.” The Annals of Statistics.
Diggle. 1985.
“A Kernel Method for Smoothing Point Process Data.” Journal of the Royal Statistical Society. Series C (Applied Statistics).
Doosti, and Hall. 2015.
“Making a Non-Parametric Density Estimator More Attractive, and More Accurate, by Data Perturbation.” Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Ellis. 1991.
“Density Estimation for Point Processes.” Stochastic Processes and Their Applications.
Greengard, and Strain. 1991.
“The Fast Gauss Transform.” SIAM Journal on Scientific and Statistical Computing.
Helmers, Mangku, and Zitikis. 2003.
“Consistent Estimation of the Intensity Function of a Cyclic Poisson Process.” Journal of Multivariate Analysis.
Ibragimov. 2001.
“Estimation of Analytic Functions.” In
Institute of Mathematical Statistics Lecture Notes - Monograph Series.
Koenker, and Mizera. 2006.
“Density Estimation by Total Variation Regularization.” Advances in Statistical Modeling and Inference.
Malec, and Schienle. 2014.
“Nonparametric Kernel Density Estimation Near the Boundary.” Computational Statistics & Data Analysis.
O’Brien, Kashinath, Cavanaugh, et al. 2016.
“A Fast and Objective Multidimensional Kernel Density Estimation Method: fastKDE.” Computational Statistics & Data Analysis.
Panaretos, and Konis. 2012.
“Nonparametric Construction of Multivariate Kernels.” Journal of the American Statistical Association.
Stein. 2005.
“Space-Time Covariance Functions.” Journal of the American Statistical Association.
van Lieshout. 2011.
“On Estimation of the Intensity Function of a Point Process.” Methodology and Computing in Applied Probability.
Yang, Duraiswami, and Davis. 2004.
“Efficient Kernel Machines Using the Improved Fast Gauss Transform.” In
Advances in Neural Information Processing Systems.
Yang, Duraiswami, Gumerov, et al. 2003.
“Improved Fast Gauss Transform and Efficient Kernel Density Estimation.” In
Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2. ICCV ’03.
Zeevi, and Meir. 1997.
“Density Estimation Through Convex Combinations of Densities: Approximation and Estimation Bounds.” Neural Networks: The Official Journal of the International Neural Network Society.
Zhang, and Karunamuni. 2010.
“Boundary Performance of the Beta Kernel Estimators.” Journal of Nonparametric Statistics.