(Weighted) least squares fits
September 22, 2016 — October 3, 2022
A classic. Surprisingly deep.
A few non-comprehensive notes on approximating functions from data by the arbitrary-but-convenient expedient of minimising the sum of the squares of the deviations between two things; The linear algebra of least squares fits seems well-trodden and perennially classic. Used in many problems. e.g. lasso regression, Gaussian belief propagation. Least squares problems are a kind of platonic ideal of convex optimisation.
Francis Bach is interested in least squares relaxations, which he connects to Fourier domain. See his Sums-of-squares for dummies: a view from the Fourier domain.
1 Introduction
- the Ceres Solver Bibliography is a good start.
- Boyd and Vandenberghe’s Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares is a solid introduction to both linear algebra and least squares. Julia Companion. Python Companion.
2 Iteratively reweighted
3 Nonlinear least squares
Trust region and Levenberg-Marquardt methods in 2nd order optimisation.
4 Tools
Nonlinear least squares with ceres-solver:
Ceres Solver is an open-source C++ library for modeling and solving large, complicated optimization problems. It can be used to solve Non-linear Least Squares problems with bounds constraints and general unconstrained optimization problems. It is a mature, feature-rich, and performant library that has been used in production at Google since 2010.
gradslam gradslam/gradslam: gradslam is an open-source differentiable dense SLAM library for PyTorch (Jatavallabhula, Iyer, and Paull 2020)
5 Jaxopt
jax toolkit JAXopt includes lots of neat Nonlinear least squares tooling.
6 KeOps
The KeOps library lets you compute reductions of large arrays whose entries are given by a mathematical formula or a neural network. It combines efficient C++ routines with an automatic differentiation engine and can be used with Python (NumPy, PyTorch), Matlab, and R.
It is perfectly suited to the computation of kernel matrix-vector products, K-nearest neighbors queries, N-body interactions, point cloud convolutions, and the associated gradients. Crucially, it performs well even when the corresponding kernel or distance matrices do not fit into the RAM or GPU memory. Compared with a PyTorch GPU baseline, KeOps provides a x10-x100 speed-up on a wide range of geometric applications, from kernel methods to geometric deep learning.
7 Incoming
- The Gauss–Markov Theorem “under certain conditions, the least squares estimator is the minimum-variance linear unbiased estimator of the model parameters.”
- Weighted Least Squares