9 September 2021, Amphithéâtre Gaston Darboux, Institut Henri Poincaré, Paris
Conference in honour of
Christian Robert's 60th birthday
Computational Bayesian Statistics
Organising and Scientific Committees
Jean-Michel Marin, Professor, University of Montpellier
Eric Moulines, Professor, École Polytechnique
Judith Rousseau, Professor, University of Oxford
Most real-life applications of statistics require the use of computational methods. This is particularly true for Bayesian statistics, where the output of an analysis is the posterior distribution — a distribution over models and parameters that quantifies the uncertainty of inferences from the data. In almost all applications the posterior distribution is intractable, and instead of evaluating probabilities and expectations analytically we use algorithms to draw samples approximately distributed according to the posterior distribution. These algorithms are important in many areas of statistics, particularly when we need to average over uncertainty within our statistical models.
Over the past forty years, Christian Robert has contributed enormously to the development of such algorithms — in particular MCMC strategies, importance sampling methods, and ABC techniques — proposing new and highly innovative methodologies as well as studying their theoretical properties. On 9 September 2021, Christian will celebrate his 60th birthday. To mark the occasion, we are organising a scientific afternoon with three presentations on the themes described above.
Senior Researcher at Inria Grenoble Rhône-Alpes
Head of team Statify
Full Professor of Applied Statistics and Econometrics at the Wirtschaftsuniversität Wien - Vienna University of Economics and Business
John L. Loeb Associate Professor of the Natural Sciences at Harvard University (up to August 2021)
Full Professor of Statistics at ESSEC Business School in Paris (from 1st September 2021)
2:30pm - 2:45pm (UTC+2 Paris) Welcome - Opening
Approximate Bayesian Computation with surrogate posteriors
2:45pm - 3:35pm (UTC+2 Paris)
A key ingredient in approximate Bayesian computation (ABC) procedures is the choice of a discrepancy that describes how different the simulated and observed data are, often based on a set of summary statistics when the data cannot be compared directly. Unless discrepancies and summaries are available from experts or prior knowledge, which seldom occurs, they have to be chosen, and this choice can affect the quality of approximations. The choice between discrepancies is an active research topic, which has mainly considered data discrepancies requiring samples of observations or distances between summary statistics. In this work, we introduce a preliminary learning step in which surrogate posteriors are built from finite Gaussian mixtures, using an inverse regression approach. These surrogate posteriors are then used in place of summary statistics and compared using metrics between distributions in place of data discrepancies. Two such metrics are investigated, a standard L$_2$ distance and an optimal transport-based distance. The whole procedure can be seen as an extension of the semi-automatic ABC framework to the functional summary statistics setting, and can also be used as an alternative to sample-based approaches. The resulting ABC quasi-posterior distribution is shown to converge to the true one, under standard conditions. Performance is illustrated on both synthetic and real data sets, where it is shown that our approach is particularly useful when the posterior is multimodal.
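To fix ideas, the following toy sketch illustrates the general mechanism the abstract describes — rejection ABC where each data set is mapped to a surrogate posterior and surrogates are compared with a closed-form L2 distance. It is not the speakers' implementation: a single conjugate Gaussian posterior stands in for their Gaussian-mixture surrogates fitted by inverse regression, and the model, threshold, and all function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

def l2_dist(p, q):
    # Closed-form L2 distance between two Gaussian densities, using
    # int N(x; m1, s1) N(x; m2, s2) dx = N(m1 - m2; 0, sqrt(s1^2 + s2^2))
    def cross(a, b):
        return gauss_pdf(a[0] - b[0], 0.0, np.sqrt(a[1] ** 2 + b[1] ** 2))
    return np.sqrt(cross(p, p) + cross(q, q) - 2.0 * cross(p, q))

def surrogate_posterior(y, prior_m=0.0, prior_s=10.0, noise_s=1.0):
    # Stand-in surrogate: the conjugate Gaussian posterior for a normal mean
    # (the talk uses Gaussian *mixture* surrogates fitted by inverse regression)
    prec = 1.0 / prior_s ** 2 + len(y) / noise_s ** 2
    m = (prior_m / prior_s ** 2 + y.sum() / noise_s ** 2) / prec
    return m, 1.0 / np.sqrt(prec)

y_obs = rng.normal(2.0, 1.0, size=50)         # "observed" data, true mean 2
p_obs = surrogate_posterior(y_obs)

accepted = []
for _ in range(20000):
    theta = rng.normal(0.0, 10.0)             # draw from the prior
    y_sim = rng.normal(theta, 1.0, size=50)   # simulate data given theta
    # accept when the surrogate posteriors are close, not the raw data
    if l2_dist(surrogate_posterior(y_sim), p_obs) < 0.5:
        accepted.append(theta)

print(len(accepted), np.mean(accepted))       # quasi-posterior sample for theta
```

The accepted draws concentrate near the true mean, showing how a distance between surrogate posteriors can replace both summary statistics and data discrepancies.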
3:45pm - 4:00pm (UTC+2 Paris) Stories
Some methods based on couplings of Markov chain Monte Carlo algorithms
4:00pm - 4:50pm (UTC+2 Paris)
Markov chain Monte Carlo algorithms are commonly used to approximate a variety of probability distributions, such as posterior distributions arising in Bayesian analysis. In the five thousand slides of this talk I will review the idea of coupling in the context of Markov chains, and how this idea not only leads to theoretical analyses of Markov chains, but also to new Monte Carlo methods. In particular, the talk will describe how coupled Markov chains can be used to obtain 1) unbiased estimators of expectations, with applications to the "cut distribution" and to normalizing constant estimation, 2) non-asymptotic convergence diagnostics for Markov chains, and 3) unbiased estimators of the asymptotic variance of an MCMC ergodic average.
5:00pm - 5:50pm (UTC+2 Paris)
From here to infinity -- bridging finite and Bayesian nonparametric mixture models in model-based clustering
This talk reviews the concept of mixture models and their application for model-based clustering from a Bayesian perspective and discusses some recent developments.
Two broad classes of mixture models are available. On the one hand, finite mixture models are employed with a finite number K of components in the mixture distribution. On the other hand, Bayesian nonparametric mixtures, in particular Dirichlet process and Pitman-Yor process mixtures, are very popular. These models admit infinitely many mixture components and imply a prior distribution on the partition of the data, with a random number of data clusters. This makes it possible to derive the posterior distribution of the number of clusters given the data, which contains useful information regarding unobserved heterogeneity in the data.
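The induced prior on the partition can be made concrete with a small simulation — here of the Chinese restaurant process corresponding to a Dirichlet process prior, with illustrative parameter choices — showing that the number of data clusters is random a priori and grows slowly with the sample size:

```python
import numpy as np

rng = np.random.default_rng(3)

def crp_partition(n, alpha):
    # Sequentially seat n customers under the Chinese restaurant process
    # induced by a Dirichlet process with concentration parameter alpha
    counts = []
    for i in range(n):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()                  # join a table or open a new one
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
    return counts

# expected number of clusters is sum_{i=1}^{n} alpha / (alpha + i - 1),
# i.e. the harmonic number H_200 ~ 5.9 for n = 200 and alpha = 1
n_clusters = [len(crp_partition(200, alpha=1.0)) for _ in range(500)]
print(np.mean(n_clusters))
```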
One reason for the popularity of Bayesian nonparametric mixtures is the common belief that finite mixture models are different in this regard and that, by selecting the number K of components in the mixture distribution, the number of data clusters is automatically forced to be equal to K. However, recent research in finite mixture models has revealed surprising similarities between finite and Bayesian nonparametric mixture models. It has been shown that, for finite mixtures too, there is a pronounced difference between the number of components in the mixture distribution and the number of clusters in the data, in particular if the mixture model is overfitting.
The concentration parameter in the Dirichlet prior on the mixture weights is instrumental in this respect and, for appropriate choices, finite mixture models also imply a prior distribution on the partition of the data with a random number of data clusters. In addition, a prior can be put on the number K of components in the mixture distribution. This makes it possible to infer K and the number of data clusters simultaneously from the data within the framework of generalized mixtures of finite mixtures, recently introduced by Frühwirth-Schnatter, Malsiner-Walli and Grün (arXiv preprint 2005.09918v2). This model class encompasses many well-known mixture modelling frameworks, including Dirichlet process and sparse finite mixtures. A new, generic MCMC sampler (called the telescoping sampler) is introduced that allows straightforward MCMC implementation and avoids the tedious design of moves required by common trans-dimensional approaches such as reversible jump MCMC.
(this talk is based on joint work with Jan Greve, Bettina Grün, and Gertraud Malsiner-Walli and supported by the Austrian Science Fund (FWF), grant P28740)
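The role of the Dirichlet concentration parameter can be illustrated with a toy prior simulation (parameter values are illustrative assumptions, not taken from the talk): with a small concentration e0, a finite mixture with K = 10 components occupies far fewer than K clusters a priori, whereas a conventional larger e0 fills essentially all of them.

```python
import numpy as np

rng = np.random.default_rng(2)

def mean_n_clusters(K, e0, n, reps=2000):
    # Draw mixture weights from a symmetric Dirichlet(e0, ..., e0) prior,
    # allocate n observations to components, and count the number of
    # occupied components K+ ("data clusters"); return its prior mean
    counts = []
    for _ in range(reps):
        w = rng.dirichlet(np.full(K, e0))
        z = rng.choice(K, size=n, p=w)
        counts.append(len(np.unique(z)))
    return float(np.mean(counts))

sparse = mean_n_clusters(K=10, e0=0.05, n=100)   # sparse finite mixture
dense = mean_n_clusters(K=10, e0=4.0, n=100)     # conventional choice
print(sparse, dense)   # far fewer occupied clusters under the sparse prior
```

This is the sense in which the number of components K and the number of data clusters K+ are distinct quantities, even in a finite mixture.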
5:50pm - 6:00pm (UTC+2 Paris) Stories - Closing