Monte Carlo methods
November 17, 2014 — August 26, 2023
Approximating functionals (traditionally integrals) by guessing cleverly. The most prominent use case is approximate statistical inference, especially Bayesian statistics, where various Monte Carlo methods, notably Markov chain ones, turn out to be effective for various inference problems. That is far from the only use, however.
1 MC theory
General MC theory is big! As a taster, try Better than Monte Carlo, an incredibly compact introduction (Chopin and Gerber 2023; Novak 2015), which explains:
Say I want to approximate the integral \[ I(f):=\int_{[0,1]^s} f(u)\, du \] based on \(n\) evaluations of function \(f\). I could use plain old Monte Carlo: \[ \hat{I}(f)=\frac{1}{n} \sum_{i=1}^n f\left(U_i\right), \quad U_i \sim \mathrm{U}\left([0,1]^s\right), \] whose RMSE (root mean square error) is \(O\left(n^{-1 / 2}\right)\). Can I do better? That is, can I design an alternative estimator/algorithm, which performs \(n\) evaluations and returns a random output, such that its RMSE converges quicker? Surprisingly, the answer to this question has been known for a long time. If I am ready to focus on functions \(f \in \mathcal{C}^r\left([0,1]^s\right)\), Bakhvalov (1959) showed that the best rate I can hope for is \(O\left(n^{-1 / 2-r / s}\right)\). That is, there exist algorithms that achieve this rate, and algorithms achieving a better rate simply do not exist.
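To make that rate concrete, here is a minimal sketch (assuming numpy; the integrand is my own toy choice) that estimates a smooth integral over \([0,1]^2\) by plain Monte Carlo and checks that the empirical RMSE shrinks at roughly \(n^{-1/2}\):

```python
import numpy as np

rng = np.random.default_rng(42)

def f(u):
    # Smooth test integrand on [0,1]^2 with known integral (e - 1)^2.
    return np.exp(u.sum(axis=-1))

true_I = (np.e - 1.0) ** 2

def mc_estimate(n, s=2):
    # Plain Monte Carlo: average f over n uniform draws on [0,1]^s.
    return f(rng.uniform(size=(n, s))).mean()

# Empirical RMSE over repeated runs: roughly halves as n quadruples.
for n in [100, 1_000, 10_000, 100_000]:
    errs = [mc_estimate(n) - true_I for _ in range(100)]
    rmse = np.sqrt(np.mean(np.square(errs)))
    print(f"n={n:>7}  RMSE={rmse:.2e}  n**-0.5={n**-0.5:.2e}")
```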
More lavish introductions: Reuven Y. Rubinstein and Kroese (2016); Kroese, Taimre, and Botev (2011); and Art Owen’s Monte Carlo theory, methods and examples.
2 Markov chain samplers
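Details are filed elsewhere, but as a placeholder, a minimal random-walk Metropolis sketch (a standard textbook scheme, assuming numpy; the bimodal target is my own toy choice):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Unnormalized log-density of a symmetric 1-d Gaussian mixture.
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

def random_walk_metropolis(n_steps, x0=0.0, step=1.0):
    x, lp = x0, log_target(x0)
    chain = np.empty(n_steps)
    for t in range(n_steps):
        prop = x + step * rng.normal()
        lp_prop = log_target(prop)
        # Accept with probability min(1, target(prop) / target(x)).
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        chain[t] = x
    return chain

chain = random_walk_metropolis(20_000)
print("mean after burn-in:", chain[5_000:].mean())  # ≈ 0 by symmetry
```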
3 Multi-level Monte Carlo
Hmmm. Also multi-scale Monte Carlo and multi-index Monte Carlo. 🚧TODO🚧 clarify
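The core multilevel trick (in the style of Giles) telescopes \[ \mathbb{E}[P_L] = \mathbb{E}[P_0] + \sum_{\ell=1}^{L} \mathbb{E}[P_\ell - P_{\ell-1}], \] estimating each correction from coupled coarse/fine simulations so that the differences have small variance and the expensive fine levels need few samples. A hedged sketch for a toy SDE expectation (assuming numpy; the geometric Brownian motion example and geometric sample-size schedule are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, T, X0 = 0.05, 0.2, 1.0, 1.0  # toy GBM parameters

def euler_pair(n_paths, level):
    """Euler-Maruyama payoffs X_T on a fine grid (2**level steps) and a
    coarse grid (2**(level-1) steps), coupled by shared Brownian increments."""
    n_fine = 2 ** level
    dt = T / n_fine
    dW = rng.normal(scale=np.sqrt(dt), size=(n_paths, n_fine))
    Xf = np.full(n_paths, X0)
    for k in range(n_fine):
        Xf = Xf * (1 + mu * dt + sigma * dW[:, k])
    if level == 0:
        return Xf, None
    Xc = np.full(n_paths, X0)
    for k in range(0, n_fine, 2):
        Xc = Xc * (1 + mu * 2 * dt + sigma * (dW[:, k] + dW[:, k + 1]))
    return Xf, Xc

# Telescoping sum of level corrections, with fewer samples at finer levels.
L, N0 = 6, 200_000
estimate = 0.0
for level in range(L + 1):
    n = max(N0 // 4 ** level, 100)
    Xf, Xc = euler_pair(n, level)
    estimate += Xf.mean() if Xc is None else (Xf - Xc).mean()

print("MLMC estimate:", estimate, " exact:", np.exp(mu * T))
```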
4 Population Monte Carlo
Not sure. See Cappé et al. (2004).
Importance sampling methods can be iterated like MCMC algorithms, while being more robust against dependence and starting values. The population Monte Carlo principle consists of iterated generations of importance samples, with importance functions depending on the previously generated importance samples. The advantage over MCMC algorithms is that the scheme is unbiased at any iteration and can thus be stopped at any time, while iterations improve the performances of the importance function, thus leading to an adaptive importance sampling.
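In code, the principle might look like this hedged sketch (my own toy illustration rather than the exact scheme of Cappé et al.): repeatedly draw an importance sample, weight it by target over proposal, then refit the proposal to the weighted sample:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def log_target(x):
    # Unnormalized log-density: a toy 1-d Gaussian mixture.
    return np.logaddexp(np.log(0.3) - 0.5 * (x + 1.0) ** 2,
                        np.log(0.7) - 0.5 * ((x - 2.0) / 0.5) ** 2)

mu, sd, n = 0.0, 5.0, 5_000        # deliberately poor initial proposal
for it in range(10):
    x = rng.normal(mu, sd, size=n)                  # importance sample
    logw = log_target(x) - stats.norm.logpdf(x, mu, sd)
    w = np.exp(logw - logw.max())
    w /= w.sum()                                    # self-normalized weights
    # Adapt: refit the proposal moments to the weighted sample.
    mu = np.sum(w * x)
    sd = np.sqrt(np.sum(w * (x - mu) ** 2))
    ess = 1.0 / np.sum(w ** 2)                      # effective sample size
    print(f"iter {it}: mu≈{mu:.3f} sd≈{sd:.3f} ESS≈{ess:.0f}")
```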
5 Sequential Monte Carlo
Filed under particle filters.
6 Quasi Monte Carlo
Don’t even guess randomly; instead, sample cleverly using the shiny Quasi Monte Carlo.
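A quick taste via scipy’s qmc module (the Sobol’ engine and random_base2 are real scipy APIs; the integrand is my own toy choice): scrambled Sobol’ points typically beat i.i.d. uniforms on smooth integrands:

```python
import numpy as np
from scipy.stats import qmc

rng = np.random.default_rng(3)
s = 4
true_I = (np.e - 1.0) ** s          # integral of exp(sum(u)) over [0,1]^4

def f(u):
    return np.exp(u.sum(axis=-1))

m = 12                              # 2**12 = 4096 points
u_qmc = qmc.Sobol(d=s, scramble=True, seed=3).random_base2(m=m)
u_mc = rng.uniform(size=(2 ** m, s))

print("MC  error:", abs(f(u_mc).mean() - true_I))
print("QMC error:", abs(f(u_qmc).mean() - true_I))
```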
7 Cross Entropy Method
For automatically adapting an importance sampling distribution. TBC.
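Roughly: sample from a parametric proposal, keep an elite fraction scored by the objective (or by rare-event membership), refit the proposal parameters to the elites, repeat. A hedged sketch of the optimization flavour, assuming a Gaussian proposal family (all choices here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

def score(x):
    return -(x - 3.0) ** 2           # toy objective to maximize; peak at 3

mu, sd = -10.0, 5.0                  # initial proposal parameters
n, n_elite = 200, 20
for it in range(20):
    x = rng.normal(mu, sd, size=n)
    elites = x[np.argsort(score(x))[-n_elite:]]   # top-scoring samples
    mu, sd = elites.mean(), elites.std() + 1e-6   # refit to elites
print("CEM found maximum near:", mu)
```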
8 Ratio of uniforms
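For the record, the idea: if \((U, V)\) is uniform on \(A = \{(u, v) : 0 < u \leq \sqrt{f(v/u)}\}\) for an (unnormalized) density \(f\), then \(V/U\) has density proportional to \(f\). A minimal rejection sketch for the standard normal, whose region \(A\) fits inside the box \((0, 1] \times [-\sqrt{2/e}, \sqrt{2/e}]\) (assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(5)

def ratio_of_uniforms_normal(n):
    """Standard-normal draws via ratio of uniforms with rejection."""
    out = np.empty(0)
    b = np.sqrt(2.0 / np.e)                 # bounding box for v
    while out.size < n:
        u = 1.0 - rng.uniform(size=2 * n)   # uniform on (0, 1]
        v = rng.uniform(-b, b, size=2 * n)
        # Accept where u <= sqrt(f(v/u)), with f(x) = exp(-x**2 / 2).
        accept = u ** 2 <= np.exp(-0.5 * (v / u) ** 2)
        out = np.concatenate([out, (v / u)[accept]])
    return out[:n]

x = ratio_of_uniforms_normal(100_000)
print("mean≈", x.mean(), " var≈", x.var())  # should be ≈ 0 and ≈ 1
```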
9 Monte Carlo gradient estimation
See MC gradients.
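For intuition before clicking through: the score-function (REINFORCE) identity \[ \nabla_\theta \mathbb{E}_{x \sim p_\theta}[f(x)] = \mathbb{E}_{x \sim p_\theta}[f(x) \nabla_\theta \log p_\theta(x)] \] turns gradient estimation into sampling. A hedged sketch for the mean of a Gaussian (the objective is a toy of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(6)

def f(x):
    return (x - 2.0) ** 2             # toy objective to minimize in expectation

theta, lr, n = 0.0, 0.1, 1_000        # x ~ N(theta, 1)
for step in range(200):
    x = rng.normal(theta, 1.0, size=n)
    # Score of N(theta, 1): d/dtheta log p(x) = x - theta.
    grad = np.mean(f(x) * (x - theta))
    theta -= lr * grad                 # stochastic gradient descent on E[f]
print("theta ≈", theta)                # E[(x - 2)^2] is minimized at theta = 2
```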