Weighted data, weighted likelihoods in statistics

November 4, 2020 — August 7, 2024


1 Bestiary

Thomas Lumley helpfully disambiguates the “three and a half distinct uses of the term weights in statistical methodology”.

The three main types of weights are

  • the ones that show up in the classical theory of weighted least squares. These describe the precision (1/variance) of observations. …. I call these precision weights; Stata calls them analytic weights.
  • the ones that show up in categorical data analysis. These describe cell sizes in a data set, so a weight of 10 means that there are 10 identical observations in the dataset, which have been compressed to a covariate pattern plus a count. … Stata calls these frequency weights, and so do I.
  • the ones that show up in classical survey sampling theory. These describe how the sample can be scaled up to the population. Classically, they were the reciprocals of sampling probabilities, so an observation with a weight of 10 was sampled with probability 1/10, and represents 10 people in the population. In real life, these are typically more complicated than just sampling probabilities, but they play the same role of trying to rescale the sample distribution to match the population distribution. I call these sampling weights, Stata calls them probability weights, other people call them design weights or grossing-up weights.

The formula for the weighted mean is the same under each interpretation, but the variance estimates differ.
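A minimal numeric sketch of that point, with invented data: the weighted mean is identical whether we read the weights as frequencies or as precisions, but the estimated variance of that mean is not. (The variance formulas below are the standard ones for frequency and precision weights; the specific values of `y` and `w` are made up for illustration.)

```python
import numpy as np

# Hypothetical data: values y with weights w; what w *means* varies.
y = np.array([2.0, 4.0, 6.0])
w = np.array([1.0, 2.0, 3.0])

# The weighted mean is the same under every interpretation of w.
mean = np.sum(w * y) / np.sum(w)

# Weighted sum of squared deviations, shared by both formulas below.
ss = np.sum(w * (y - mean) ** 2)

# Frequency weights: w_i identical copies of datum i, so the effective
# sample size is sum(w) and the variance of the mean shrinks with it.
n_freq = np.sum(w)
var_freq = ss / (n_freq - 1) / n_freq

# Precision weights: w_i is proportional to 1/Var(y_i); only relative
# weights matter, and the effective sample size is just len(y).
n_prec = len(y)
var_prec = ss / (n_prec - 1) / np.sum(w)

print(mean, var_freq, var_prec)
```

Same mean, very different claimed uncertainty: the frequency-weight reading pretends we saw six observations, the precision-weight reading only three.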

TODO: I know that iteratively reweighted least squares fitting is a thing, and that it corresponds to robust Gaussian belief propagation, but why is it not common for other log-additive likelihoods in frequentist statistics? Why do I not have precision weights for my other estimates? The idea looks relatively simple, and I am not sure why it is not more widespread.

Update: it is more common than I thought, occurring in robust statistics and in iteratively reweighted least squares fitting for generalized linear models. Mathematically it is the same trick as annealing.
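To make the robust-statistics connection concrete, here is a sketch of iteratively reweighted least squares for robust linear regression with Huber weights: each iteration downweights large residuals, then refits by weighted least squares. This is an illustrative implementation, not a reference one; the function name, the MAD scale estimate, and the default tuning constant `delta=1.345` are my choices, not something from the text.

```python
import numpy as np

def irls_huber(X, y, delta=1.345, n_iter=50):
    """Robust linear regression by iteratively reweighted least squares.

    Huber weights leave small residuals alone and downweight residuals
    larger than `delta` robust-scale units. A sketch, not a library.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from OLS
    for _ in range(n_iter):
        r = y - X @ beta
        scale = np.median(np.abs(r)) / 0.6745 + 1e-12  # MAD scale estimate
        u = np.abs(r / scale)
        w = np.where(u <= delta, 1.0, delta / u)  # Huber weight function
        # Weighted least squares step: solve (X' W X) beta = X' W y.
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ y)
    return beta

# Usage: a line with one gross outlier; the robust fit shrugs it off.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + 0.05 * rng.standard_normal(50)
y[10] += 10.0  # contaminate one observation
beta_robust = irls_huber(X, y)
```

The point of the exercise: the weights here are *estimated precisions*, recomputed each iteration, which is exactly the "precision weights for my estimates" pattern wished for above.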

What is less common is a Bayesian interpretation.

2 Bayesian reweighting

Interesting, and surprisingly recent: reweighting the per-datum likelihood terms ends up being equivalent to tempering. See Wang, Kucukelbir, and Blei (2017).
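The core device in that paper is raising each per-datum likelihood term to a power \(w_i\); setting every \(w_i\) to the same constant recovers plain likelihood tempering. A minimal sketch with a Gaussian mean and weights held fixed (Wang et al. learn them; the prior scale and the grid search below are my simplifications):

```python
import numpy as np

def weighted_log_posterior(mu, y, w, sigma=1.0, prior_sd=10.0):
    """Log posterior for a Gaussian mean where datum i's likelihood is
    raised to the power w_i. All w_i equal to T is tempering at 1/T.
    """
    loglik = -0.5 * ((y - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
    logprior = -0.5 * (mu / prior_sd) ** 2
    return np.sum(w * loglik) + logprior

# Downweight a suspect observation and locate the reweighted MAP by
# brute-force grid search (everything is conjugate, so the MAP also has
# a closed form: a precision-weighted combination of data and prior).
y = np.array([1.0, 2.0, 10.0])
w = np.array([1.0, 1.0, 0.1])  # the third datum barely counts
grid = np.linspace(-5.0, 15.0, 4001)
lp = np.array([weighted_log_posterior(m, y, w) for m in grid])
mu_map = grid[np.argmax(lp)]
```

With these weights the posterior concentrates near the two trusted observations rather than being dragged toward 10, which is the robustness the reweighting buys.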

3 References

Barron. 2019. “A General and Adaptive Robust Loss Function.” In.
Boyer, McDonald, and Newey. 2003. “A Comparison of Partially Adaptive and Reweighted Least Squares Estimation.” Econometric Reviews.
Sato, Owens, and Prosper. 2014. “Bayesian Reweighting for Global Fits.” Physical Review D.
Wang, Kucukelbir, and Blei. 2017. “Robust Probabilistic Modeling with Bayesian Data Reweighting.” In Proceedings of the 34th International Conference on Machine Learning.
Zhang. 1997. “Parameter Estimation Techniques: A Tutorial with Application to Conic Fitting.” Image and Vision Computing.