## BayesBag, and how to approximate it

Hi all,

This post describes how unbiased MCMC can help in approximating expectations with respect to “BayesBag”, an alternative to standard posterior distributions mentioned in Peter Bühlmann‘s discussion of Big Bayes Stories (which was a special issue of Statistical Science). Essentially BayesBag is the result of “bagging” applied to “Bayesian inference”. In passing, here is an R script implementing this on a model written in the Stan language (as in this previous post), namely a Negative Binomial regression, and using a pure R implementation of unbiased HMC (joint work with Jeremy Heng). The script produces the following figure: which shows, for two parameters of the model, the cumulative distribution function (CDF) under standard Bayes (blue thin line) and under BayesBag (wider red line). BayesBag results in distributions on the parameter space that are more “spread out” than standard Bayes.

Continue reading “BayesBag, and how to approximate it”

## Coding algorithms in R for models written in Stan

Hi all,

On top of recommending the excellent autobiography of Stanislaw Ulam, this post is about using the software Stan, but not directly to perform inference, instead to obtain R functions to evaluate a target’s probability density function and its gradient. With which, one can implement custom methods, while still benefiting from the great work of the Stan team on the “modeling language” side. As a proof of concept I have implemented a plain Hamiltonian Monte Carlo sampler for a random effect logistic regression model (taken from a course on Multilevel Models by Germán Rodríguez), a coupling of that HMC algorithm (as in “Unbiased Hamiltonian Monte Carlo with couplings“, see also this very recent article on the topic of coupling HMC), and then upper bounds on the total variation distance between the chain and its limiting distribution, as in “Estimating Convergence of Markov chains with L-Lag Couplings“.

The R script is here: https://github.com/pierrejacob/statisfaction-code/blob/master/2019-09-stan-logistic.R and is meant to be as simple as possible, and self-contained; warning, this is all really proof of concept and not thoroughly tested.

Basically the R script starts like a standard script that would use rstan for inference; it runs the default algorithm of Stan for a little while, then extracts some info from the “stanfit” object.  With these, a pure R implementation of TV upper bounds for a naive HMC algorithm follows, that relies on functions called “stan_logtarget”  and “stan_gradlogtarget” to evaluate the target log-pdf and its gradient.

The script takes a few minutes to run in total. Some time is first needed to compile the Stan code, and to run Stan for a few steps. Then some time spent towards the end of the script to generate 250 independent meeting times with a lag of 500 between the chains; the exact run time will of course depend a lot on your number of available processors (on my machine it takes around one minute). The script produces this plot: This plot suggests that vanilla HMC as implemented in the script converges in less than 1000 iterations to its stationary distribution. This is probably quite conservative, but it’s still usable.

In passing, upon profiling the code of the function that generates each meeting time, it appears that half of the time is spent in Stan‘s “grad_log_prob” function (which computes the gradient of the log pdf of the target). This implies that not that much efficiency is lost in the fact that the algorithms are coded in pure R, at least for this model.

## Estimating convergence of Markov chains

Hi all,

Niloy Biswas (PhD student at Harvard) and I have recently arXived a manuscript on the assessment of MCMC convergence (using couplings!). Here I’ll describe the main result, and some experiments (that are not in the current version of the paper) revisiting a 1996 paper by Mary Kathryn Cowles and Jeff Rosenthal entitled “A simulation approach to convergence rates for Markov chain Monte Carlo algorithms“. Code in R producing the figures of this post is available here.

Continue reading “Estimating convergence of Markov chains”

## particles

I have released a few months ago a Python package for particle filtering, called particles; you can find it on Github here. You may want to have a look first at the documentation, in particular the tutorials here.

This package has been developed to support our (with Omiros Papaspiliopoulos) forthcoming book called (tentatively): an introduction to Sequential Monte Carlo. It implements all the algorithms discussed in the book; e.g.

• bootstrap, guided and auxiliary particle filters
• all standard resampling schemes
• most particle smoothing algorithms
• sequential quasi-Monte Carlo
• PMCMC (PMMH, Particle Gibbs), SMC^2
• SMC samplers

It also contains all the scripts that were used to perform the numerical experiments discussed in the book. A random plot taken from the forthcoming book. Can you guess what it represents exactly?

This package is hopefully useful to people with different expectations and level of expertise. For instance, if you just want to run a particle filter for a basic state-space model, you may describe that model as follows:

```import particles
from particles import state_space_models as ssm

class ToySSM(ssm.StateSpaceModel):
def PX0(self):  # Distribution of X_0
return dists.Normal()  # X_0 ~ N(0, 1)
def PX(self, t, xp):  # Distribution of X_t given X_{t-1}
return dists.Normal(loc=xp)  # X_t ~ N( X_{t-1}, 1)
def PY(self, t, xp, x):  # Distribution of Y_t given X_t (and X_{t-1})
return dists.Normal(loc=x, scale=self.sigma)  # Y_t ~ N(X_t, sigma^2)
```

And then simulate data, and run the corresponding bootstrap filter, as follows:

```my_model = ToySSM(sigma=0.2)
x, y = my_model.simulate(200)  # sample size is 200

alg = particles.SMC(fk=ssm.Bootstrap(ssm=my_model, data=y), N=200)
alg.run()
```

On the other hand, if you are an SMC expert, you may re-use only the parts you need; e.g. a resampling scheme:

```from particles import resampling

A = resampling.systematic(W)
```

Up to now, this package has been tested mostly by my PhD students, and the students of my M2 course on particle filtering at the ENSAE; many thanks to all of them. Since no computer screen has been smashed in the process, I guess I can publicize it a bit more. Please let me know if you have any questions, comments, or feature request. (You may report a bug by raising an issue on the Github page.)

Based on your feedback, I’m planning to write a few more posts in the coming weeks about particles and more generally numerical computation in Python. Stay tuned!

## Budget constrained simulations

Hi all,

This post is about some results from “Bias Properties of Budget Constrained Simulations“, by Glynn & Heidelberger and published in Operations Research in 1990. I have found these results extremely useful, and our latest manuscript on unbiased MCMC recalls them in detail. Below I go through some of the results and describe the simulations that lead to the above figure.

## Another update on unbiased smoothing Hi all,

This is a short update on my research on unbiased smoothing with coupled conditional particle filters. In a previous post I naively explained that I was done with the project since the article was accepted for publication in a journal.

However, a bug was found in the code thanks to a very careful reader. So I’ve fixed the code, re-launched all the simulations, and updated the arXiv version straight away; that was on September 5, 2018. The conclusions of the article are unchanged, thankfully; in fact, the text of the article is exactly the same, only the numerical values in the tables and the figures are different. Phew!

Since then the version with the buggy results was retracted from the journal, and hopefully, the updated version will be published soon. In the meantime the arXiv version is up-to-date.

## Another take on the Hyvärinen score for model comparison Exact log-Bayes factors (log-BF) and H-factors (HF) of M1 against M2, computed for 100 independent samples (thin solid lines) of 1000 observations generated as i.i.d. N(1,1), under three increasingly vague priors for θ1.

In a former post, Pierre wrote about Bayesian model comparison and the limitations of Bayes factors in the presence of vague priors. Here we are, one year later, and I am happy to announce that our joint work with Jie Ding and Vahid Tarokh has been recently accepted for publication. As way of celebrating, allow me to give you another take on the matter.

Continue reading “Another take on the Hyvärinen score for model comparison”

## Final update on unbiased smoothing Two coupled chains, marginally following a conditional particle filter algorithm with ancestor sampling, and meeting in 20 iterations. Script here.

Hi,

Two years ago I blogged about couplings of conditional particle filters for smoothing.  The paper with Fredrik Lindsten and Thomas Schön has just been accepted for publication at JASA, and the arXiv version and github repository are hopefully in their final forms. Here I’ll mention a few recent developments and follow-up articles by other researchers.

Continue reading “Final update on unbiased smoothing”

## Couplings of Normal variables Hi,

Just to play a bit with the gganimate package, and to celebrate National Coupling Day, the above plot shows different couplings of two univariate Normal distributions, Normal(0,1) and Normal(2,1). That is, each point is a pair (x,y) where x follows a Normal(0,1) and y follows a Normal(2,1). Below I’ll recall briefly how each coupling operates, in the Normal case. The code is available at the end of the post.

Continue reading “Couplings of Normal variables”

## Different ways of using MCMC algorithms Hi,

This post is about different ways of using Markov chain Monte Carlo (MCMC) algorithms for numerical integration or sampling. It can be a hard job to design an MCMC algorithm for a given target distribution. Once it’s finally implemented, it gives a way of sampling a new point X’ given an existing point X. From there, the algorithm can be used in various ways to construct estimators of integrals/distribution of interest. Some ways are more amenable to parallel computing than others. I give some examples with references below.

Continue reading “Different ways of using MCMC algorithms”