Effective sample size
November 21, 2016 — March 3, 2020
We have an estimator \(\hat{\theta}\) of some statistic, which for the sake of argument we take to be the mean, calculated from observations of some stochastic process \(\mathsf{v}\). Under certain assumptions, central limit theorems tell us that the variance of the estimator calculated from \(N\) i.i.d. samples is \(\operatorname{Var}(\hat{\theta})\propto 1/N.\) Effective Sample Size (ESS) gives us a different \(N\), \(N_{\text{eff}}\), such that \(\operatorname{Var}(\hat{\theta})\propto 1/N_{\text{eff}}.\)
1 Statistics
When your data are highly correlated (e.g. because they form a time series, or because of non-random sampling in the experiment design), they may give you less information than you would hope, or expect from the uncorrelated case, with regard to a particular statistic you wish to calculate. In practice, all the introductions use the mean.
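A quick numerical check of that claim, as a minimal sketch (Python and numpy are my choice here, not anything from the sources below): for an AR(1) process with coefficient \(\phi\), the variance of the sample mean is inflated, relative to i.i.d. sampling, by a factor of \((1+\phi)/(1-\phi)\).

```python
import numpy as np

rng = np.random.default_rng(1)

def ar1(n, phi):
    """Simulate a stationary AR(1) process x_t = phi * x_{t-1} + e_t."""
    x = np.empty(n)
    x[0] = rng.normal() / np.sqrt(1 - phi**2)  # start in the stationary distribution
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

phi, n, reps = 0.8, 1000, 2000
means = np.array([ar1(n, phi).mean() for _ in range(reps)])

marginal_var = 1.0 / (1 - phi**2)  # Var(x_t) with unit innovation variance
inflation = (1 + phi) / (1 - phi)  # known variance inflation factor for AR(1)
print("empirical Var of the mean:", means.var())
print("naive sigma^2 / N:        ", marginal_var / n)
print("inflated sigma^2 / N:     ", inflation * marginal_var / n)
```

With \(\phi=0.8\) the inflation factor is 9, so those 1000 correlated observations are worth about 111 independent ones as far as the mean is concerned.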
This is a kind of dual to effective degrees of freedom, which tells you how far your sample size can get you.
Turns out to be important in, e.g. LASSO and, circularly, covariance estimation.
- Tom Leinster, effective sample size.
Huber’s (1981) “equivalent number of observations” is probably the same?
2 Monte Carlo estimation
Related, but a slightly different setup: this is not about experimental samples, but about the number of simulations in simulation-based inference, where you are using e.g. importance sampling or a sequential Markov chain sampler. See Sebastian Nowozin, Effective Sample Size in Importance Sampling, and Kenneth Tay, Effective sample size for Markov Chain Monte Carlo.
[The effective sample size] can be used after or during importance sampling to provide a quantitative measure of the quality of the estimated mean. Even better, the estimate is provided on a natural scale of worth in samples from p, that is, if we use \(n=1000\) samples \(X_i\sim q\) and obtain an ESS of say 350 then this indicates that the quality of our estimate is about the same as if we would have used 350 direct samples.
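For concreteness, the estimator these references discuss is the weight-based one: given (possibly unnormalized) importance weights \(w_i = p(X_i)/q(X_i)\), we estimate \(N_{\text{eff}} = \left(\sum_i w_i\right)^2 \big/ \sum_i w_i^2.\) A minimal sketch in code (the example target and proposal are my own choice):

```python
import numpy as np

def importance_ess(weights):
    """Weight-based ESS: (sum w)^2 / sum(w^2). Scale-invariant, so
    unnormalized weights are fine."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w**2).sum()

# Example: target p = N(0, 1), proposal q = N(0, 2^2).
rng = np.random.default_rng(1)
x = rng.normal(scale=2.0, size=1000)
log_w = -0.5 * x**2 + 0.5 * (x / 2.0) ** 2  # log p(x) - log q(x), up to a constant
w = np.exp(log_w - log_w.max())             # subtract the max before exponentiating
print(importance_ess(w))                    # noticeably fewer than the nominal 1000
```

The max-subtraction before exponentiating matters in real problems, where raw weights can easily overflow or underflow.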
Since Markov chain Monte Carlo is so common in statistical inference these days, this simulation sense of effective sample size might now be the more common usage, rather than the directly statistical notion of the term.
In practice, we usually cargo-cult in the formulae for ESS from a paper on how best to do it. The Stan method for ESS in MCMC, which is a best-practice method AFAICT, is based on the autocorrelogram:
\[ \tau = 1 + 2 \sum_{t=1}^{m} \rho_{t} \] and
\[N_{\text{eff}}=\frac{N}{\tau}.\]
Here \(\rho_{t}\) is the autocorrelation at lag \(t\), and \(m\) is a truncation lag beyond which the estimated autocorrelations are too noisy to be worth summing. We are implicitly assuming the statistic of interest here is the mean. A short calculation should persuade us that this gives us the convergence rate we expect.
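Sketching that calculation, for a stationary process with marginal variance \(\sigma^2\):

\[
\operatorname{Var}(\bar{x})
= \frac{1}{N^{2}} \sum_{s=1}^{N} \sum_{t=1}^{N} \operatorname{Cov}(x_{s}, x_{t})
= \frac{\sigma^{2}}{N}\left(1 + 2\sum_{t=1}^{N-1}\left(1 - \frac{t}{N}\right)\rho_{t}\right)
\approx \frac{\sigma^{2}\tau}{N}
= \frac{\sigma^{2}}{N_{\text{eff}}},
\]

where the approximation holds for large \(N\) when the autocorrelations decay quickly enough that the \(t/N\) terms are negligible. So \(N_{\text{eff}}=N/\tau\) is precisely the number of i.i.d. draws that would give the same variance for the mean.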
In practice, you are doing Markov chain Monte Carlo because the problem is difficult enough that you cannot calculate an effective sample size analytically. So you estimate it, which means, in turn, estimating those autocorrelations. We can use the FFT to calculate the autocorrelation efficiently (Geyer 2011). There are various fiddly details, especially if running multiple Markov chains.
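Here is a minimal single-chain sketch of that pipeline (the function names are mine, and the truncation rule is Geyer's initial-positive-sequence heuristic rather than Stan's full multi-chain recipe):

```python
import numpy as np

def autocorrelation_fft(x):
    """Sample autocorrelation of a 1-d chain via the FFT, in O(N log N)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    nfft = 1 << (2 * n - 1).bit_length()  # zero-pad to avoid circular wrap-around
    f = np.fft.rfft(x, nfft)
    acov = np.fft.irfft(f * np.conj(f), nfft)[:n] / n
    return acov / acov[0]

def ess_geyer(x):
    """ESS of one chain: sum autocorrelations in adjacent pairs, stopping at
    the first non-positive pair sum (Geyer's initial positive sequence)."""
    rho = autocorrelation_fft(x)
    n = len(x)
    tau = -1.0  # the pair sums below double-count rho_0 = 1, hence the -1
    for t in range(0, n - 1, 2):
        pair = rho[t] + rho[t + 1]
        if pair <= 0.0:  # estimation noise has taken over; truncate here
            break
        tau += 2.0 * pair
    return n / tau

# Usage: a sticky AR(1) "chain" with phi = 0.9, so tau = 19 in theory.
rng = np.random.default_rng(1)
chain = np.empty(10_000)
chain[0] = 0.0
for t in range(1, len(chain)):
    chain[t] = 0.9 * chain[t - 1] + rng.normal()
print(ess_geyer(chain))  # roughly 10_000 / 19, i.e. about 500
```

Stan's production version additionally pools within- and between-chain variance estimates (so that chains stuck in different modes deflate the ESS) and enforces monotonicity of the paired sums, which is where most of the fiddly details live.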