Statisfaction

Estimating convergence of Markov chains

Posted in Statistics by Pierre Jacob on 5 June 2019

Upper bounds as never seen on TV.

Hi all,

Niloy Biswas (PhD student at Harvard) and I have recently arXived a manuscript on the assessment of MCMC convergence (using couplings!). Here I’ll describe the main result, along with some experiments (not in the current version of the paper) revisiting a 1996 paper by Mary Kathryn Cowles and Jeff Rosenthal entitled “A simulation approach to convergence rates for Markov chain Monte Carlo algorithms”. The R code producing the figures of this post is available here.
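
As a point of reference (this inequality is classical, and only the starting point of such approaches, not the result of the manuscript): if a chain (X_t) targeting π is coupled with a stationary chain, with the two chains remaining equal after they meet at time τ, then for all t,

\[ \| \mathcal{L}(X_t) - \pi \|_{\mathrm{TV}} \le \mathbb{P}(\tau > t), \]

so that information on the distribution of the meeting time τ translates into upper bounds on the total variation distance to stationarity.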

(more…)

particles

Posted in General, Statistics by nicolaschopin on 4 June 2019

A few months ago I released a Python package for particle filtering, called particles; you can find it on GitHub here. You may want to have a look first at the documentation, in particular the tutorials here.

This package has been developed to support our forthcoming book with Omiros Papaspiliopoulos, tentatively titled An Introduction to Sequential Monte Carlo. It implements all the algorithms discussed in the book, e.g.

  • bootstrap, guided and auxiliary particle filters
  • all standard resampling schemes
  • most particle smoothing algorithms
  • sequential quasi-Monte Carlo
  • PMCMC (PMMH, Particle Gibbs), SMC^2
  • SMC samplers

It also contains all the scripts that were used to perform the numerical experiments discussed in the book.

A random plot taken from the forthcoming book. Can you guess what it represents exactly?

This package is hopefully useful to people with different expectations and levels of expertise. For instance, if you just want to run a particle filter for a basic state-space model, you may describe that model as follows:

import particles
from particles import distributions as dists  # needed below for dists.Normal
from particles import state_space_models as ssm

class ToySSM(ssm.StateSpaceModel):
    def PX0(self):  # Distribution of X_0 
        return dists.Normal()  # X_0 ~ N(0, 1)
    def PX(self, t, xp):  # Distribution of X_t given X_{t-1}
        return dists.Normal(loc=xp)  # X_t ~ N( X_{t-1}, 1)
    def PY(self, t, xp, x):  # Distribution of Y_t given X_t (and X_{t-1}) 
        return dists.Normal(loc=x, scale=self.sigma)  # Y_t ~ N(X_t, sigma^2)

And then simulate data, and run the corresponding bootstrap filter, as follows:

my_model = ToySSM(sigma=0.2)
x, y = my_model.simulate(200)  # sample size is 200

alg = particles.SMC(fk=ssm.Bootstrap(ssm=my_model, data=y), N=200)
alg.run()
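
Once run() has returned, the alg object carries the output of the filter; for instance (a minimal sketch, assuming the current API, in which the attribute logLt stores the running log-likelihood estimate):

print(alg.logLt)  # estimate of the log-likelihood of the 200 observations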

On the other hand, if you are an SMC expert, you may re-use only the parts you need; e.g. a resampling scheme:

import numpy as np
from particles import resampling

W = np.random.dirichlet(np.ones(100))  # some normalised weights, for illustration
A = resampling.systematic(W)  # ancestor indices drawn by systematic resampling

Up to now, this package has been tested mostly by my PhD students and by the students of my M2 course on particle filtering at ENSAE; many thanks to all of them. Since no computer screen has been smashed in the process, I guess I can publicize it a bit more. Please let me know if you have any questions, comments, or feature requests. (You may report a bug by raising an issue on the GitHub page.)

Based on your feedback, I’m planning to write a few more posts in the coming weeks about particles and, more generally, about numerical computation in Python. Stay tuned!

Budget constrained simulations

Posted in Statistics by Pierre Jacob on 5 February 2019

You’ll have to read the post to get what these lines are about.

Hi all,

This post is about some results from “Bias Properties of Budget Constrained Simulations”, by Glynn & Heidelberger, published in Operations Research in 1990. I have found these results extremely useful, and our latest manuscript on unbiased MCMC recalls them in detail. Below I go through some of the results and describe the simulations that led to the above figure.
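
To give a flavour of these bias properties, here is a minimal sketch in Python (a toy example of my own, not the exact setting of the paper): replications are run until a fixed budget is exhausted, the replication still in progress at that point is discarded, and the average of the completed outputs is biased, because long replications are more likely to straddle the budget and be thrown away.

import numpy as np

rng = np.random.default_rng(1)

def budget_constrained_average(budget):
    # Average the outputs of replications completed within the budget;
    # here each output is Exp(1)-distributed and costs its own value to compute.
    total_cost, outputs = 0.0, []
    while True:
        x = rng.exponential()
        if total_cost + x > budget:
            break  # the unfinished replication is discarded
        total_cost += x
        outputs.append(x)
    return np.mean(outputs) if outputs else 0.0

estimates = [budget_constrained_average(5.0) for _ in range(10_000)]
print(np.mean(estimates))  # below E[X] = 1: the budget constraint induces bias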

(more…)

Another take on the Hyvärinen score for model comparison

Posted in Statistics by Stephane Shao on 22 September 2018

Exact log-Bayes factors (log-BF) and H-factors (HF) of M1 against M2, computed for 100 independent samples (thin solid lines) of 1000 observations generated as i.i.d. N(1,1), under three increasingly vague priors for θ1.

In a former post, Pierre wrote about Bayesian model comparison and the limitations of Bayes factors in the presence of vague priors. Here we are, one year later, and I am happy to announce that our joint work with Jie Ding and Vahid Tarokh has recently been accepted for publication. By way of celebration, allow me to give you another take on the matter.

(more…)

Final update on unbiased smoothing

Posted in Statistics by Pierre Jacob on 27 August 2018

Two coupled chains, marginally following a conditional particle filter algorithm with ancestor sampling, and meeting in 20 iterations. Script here.

Hi,

Two years ago I blogged about couplings of conditional particle filters for smoothing. The paper, with Fredrik Lindsten and Thomas Schön, has just been accepted for publication in JASA, and the arXiv version and GitHub repository are hopefully in their final forms. Here I’ll mention a few recent developments and follow-up articles by other researchers.

(more…)

Couplings of Normal variables

Posted in R, Statistics by Pierre Jacob on 24 August 2018

[Animation: different couplings of two univariate Normal distributions.]

Hi,

Just to play a bit with the gganimate package, and to celebrate National Coupling Day, the animation above shows different couplings of two univariate Normal distributions, Normal(0,1) and Normal(2,1). That is, each point is a pair (x, y) where x follows a Normal(0,1) and y follows a Normal(2,1). Below I’ll recall briefly how each coupling operates in the Normal case. The code is available at the end of the post.
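
For instance, one of the couplings in the animation is the maximal coupling, which maximizes the probability that x = y. Here is a minimal sketch of Thorisson’s rejection construction, written in Python for concreteness (the post’s own code is in R, and the function name is mine):

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def maximal_coupling(mu1=0.0, mu2=2.0, sigma=1.0):
    # Returns (x, y) with x ~ Normal(mu1, sigma^2) and y ~ Normal(mu2, sigma^2),
    # maximizing the probability of the event {x == y}.
    p, q = norm(mu1, sigma), norm(mu2, sigma)
    x = p.rvs(random_state=rng)
    if rng.uniform() * p.pdf(x) <= q.pdf(x):
        return x, x
    while True:  # otherwise, sample y from the residual part of q
        y = q.rvs(random_state=rng)
        if rng.uniform() * q.pdf(y) > p.pdf(y):
            return x, y

pairs = [maximal_coupling() for _ in range(10_000)]
print(np.mean([x == y for x, y in pairs]))  # equals 1 minus the total variation distance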

(more…)

Different ways of using MCMC algorithms

Posted in General, Statistics by Pierre Jacob on 19 July 2018

Hi,

This post is about different ways of using Markov chain Monte Carlo (MCMC) algorithms for numerical integration or sampling. Designing an MCMC algorithm for a given target distribution can be a hard job. Once it’s finally implemented, it provides a way of sampling a new point X’ given an existing point X. From there, the algorithm can be used in various ways to construct estimators of integrals/distributions of interest. Some ways are more amenable to parallel computing than others. I give some examples with references below.
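
To fix ideas, here is a minimal sketch (a toy standard Normal target, my own illustrative code) of such a kernel and of its most classical use, the ergodic average along a single long chain:

import numpy as np

rng = np.random.default_rng(42)

def mh_kernel(x, log_target, step=0.5):
    # One random walk Metropolis-Hastings step: propose X', accept or stay at X.
    proposal = x + step * rng.standard_normal()
    if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
        return proposal
    return x

log_target = lambda x: -0.5 * x ** 2  # standard Normal, up to an additive constant
chain = np.empty(10_000)
chain[0] = 0.0
for t in range(1, chain.size):
    chain[t] = mh_kernel(chain[t - 1], log_target)
print(chain.mean())  # ergodic average, estimating E[X] = 0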

(more…)

Scaling of MCMC with dimension (experiments)

Posted in Statistics by Pierre Jacob on 15 May 2018

Taken from Wikipedia’s article on dimension.

Hi all,

In this post, I’ll go through numerical experiments illustrating the scaling of some MCMC algorithms with respect to the dimension of the target distribution. I will focus on a simple setting, to illustrate some theoretical results developed over many years by Gareth Roberts, Andrew Stuart, Alexandros Beskos, Jeff Rosenthal and many of their co-authors: for instance here for random walk Metropolis-Hastings (RWMH, and see here for more recent work), here for the Metropolis-adjusted Langevin algorithm (MALA), and here for Hamiltonian Monte Carlo (HMC).
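
As a concrete taste of these results, here is a minimal sketch (my own toy code, not taken from the references) of RWMH on a standard Normal target in dimension d, with proposal standard deviation proportional to d^{-1/2}; the acceptance rate then stabilizes near the famous 0.234 as d grows:

import numpy as np

rng = np.random.default_rng(7)

def rwmh_acceptance_rate(d, ell=2.38, n_iters=10_000):
    # Random walk Metropolis-Hastings on N(0, I_d) with step size ell / sqrt(d).
    x = rng.standard_normal(d)
    n_accepts = 0
    for _ in range(n_iters):
        y = x + (ell / np.sqrt(d)) * rng.standard_normal(d)
        if np.log(rng.uniform()) < 0.5 * (x @ x - y @ y):
            x, n_accepts = y, n_accepts + 1
    return n_accepts / n_iters

for d in [1, 10, 100, 1000]:
    print(d, rwmh_acceptance_rate(d))  # approaches roughly 0.234 as d grows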

(more…)

A big problem in our community

Posted in General, Seminar/Conference, Statistics by Pierre Jacob on 14 December 2017

“Tout va très bien”, meaning “all is well”, by Franquin.

Hi all,

Kristian Lum, who was already one of my Statistics superheroes for her many interesting papers and great talks, bravely wrote the following text about her experience as a young statistician going to conferences:

https://medium.com/@kristianlum/statistics-we-have-a-problem-304638dc5de5

I can’t thank Kristian enough for speaking out. Her experience is both shocking and hardly surprising. Many, many academics report similar stories. This simply cannot go on.

I happen to have gone to the conferences mentioned by Kristian, and my experience as a young man was completely different. It was all about meeting interesting people, discussing ideas, being challenged, and having a good time. Nobody harassed, touched or assaulted me. There was some flirting, as I guess is natural when hundreds of people are put in sunny places far away from home, but I was never the victim of any misconduct or abuse of power. So instead of driving me out of the field, conferences became important, enriching and rewarding moments of my professional life.

Looking back at those conferences, I feel sick and heartbroken at the thought that some of my peers were having such a difficult time, because of predators who never face the consequences of their actions. Meanwhile, I was part of the silent majority.

The recent series of revelations about sexual harassment and assault in other professional environments indicates that this is not specific to our field, nor to academia. But that does not make it any more acceptable. I know for a fact that many leaders of our field take this issue extremely seriously (as Kristian mentions too), but clearly much, much more needs to be done. The current situation is shameful; strong and coordinated actions will be needed to fix it. Thanks again to Kristian for the wake-up call.

Bayesian model comparison with vague or improper priors

Posted in Statistics by Pierre Jacob on 6 November 2017


Synthetic data from a Lévy-driven stochastic volatility model (top), log-Bayes factor between two such models (middle) and “Hyvärinen factor” (proposed approach, bottom). Each line represents a different Monte Carlo estimate, obtained sequentially over time.

Hi,

With Stephane Shao, Jie Ding and Vahid Tarokh, we have just arXived a tech report entitled “Bayesian model comparison with the Hyvärinen score: computation and consistency”. Here I’ll explain the context, that is, scoring rules and the Hyvärinen score (originating in Hyvärinen’s score matching approach to inference), and then what we actually do in the paper.
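
For context, the Hyvärinen score of a (twice differentiable) density p at an observation y is

\[ H(y, p) = 2\,\Delta_y \log p(y) + \| \nabla_y \log p(y) \|^2, \]

which depends on p only through derivatives of log p(y), and is therefore unaffected by multiplicative constants; this is what makes it attractive when normalizing constants are unavailable, or when priors are vague or improper.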

(more…)
