Hi! It’s been too long!
In a recent arXiv entry, Espen Bernton, Mathieu Gerber, Christian P. Robert and I explore the use of the Wasserstein distance to perform parameter inference in generative models. A by-product is an ABC-type approach that bypasses the choice of summary statistics. Instead, one chooses a metric on the observation space. Our work fits in the minimum distance estimation framework and is particularly related to “On minimum Kantorovich distance estimators”, by Bassetti, Bodini and Regazzini. A recent and closely related paper is “Wasserstein training of restricted Boltzmann machines”, by Montavon, Müller and Cuturi, who have similar objectives but do not consider purely generative models. As in that paper, we make heavy use of recent breakthroughs in numerical methods to approximate Wasserstein distances, breakthroughs which were not available to Bassetti, Bodini and Regazzini in 2006.
Here I’ll describe the main ideas in a simple setting. If you’re excited about ABC, asymptotic properties of minimum Wasserstein estimators, Hilbert space-filling curves, delay reconstructions and Takens’ theorem, or SMC samplers with r-hit kernels, check our paper!
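As a small illustration of the distance itself (not of the estimators studied in the paper), here is a minimal sketch in one dimension, where the Wasserstein distance between two samples of equal size reduces to comparing sorted values. The function name wasserstein_1d, the toy Gaussian model in simulate and the grid of parameter values are all assumptions made for this example.

```r
# p-Wasserstein distance between two empirical distributions on the real line,
# assuming both samples have the same size (a simplifying assumption).
wasserstein_1d <- function(x, y, p = 1) {
  stopifnot(length(x) == length(y), p >= 1)
  # In one dimension, the optimal coupling matches the order statistics.
  mean(abs(sort(x) - sort(y))^p)^(1 / p)
}

set.seed(1)
y_obs <- rnorm(1000, mean = 2)                         # stand-in for observed data
simulate <- function(theta, n) rnorm(n, mean = theta)  # hypothetical generative model

# Crude minimum distance estimation over a grid of candidate parameters.
thetas <- seq(0, 4, by = 0.5)
distances <- sapply(thetas, function(theta) {
  wasserstein_1d(y_obs, simulate(theta, length(y_obs)))
})
theta_hat <- thetas[which.min(distances)]
print(theta_hat)
```

No summary statistic appears anywhere: the only choice is the metric on the observation space, here the absolute difference. In higher dimensions the sorting trick no longer applies, which is where numerical approximations of the distance come in.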
In this post, I’ll explain the new smoother introduced in our paper Coupling of Particle Filters with Fredrik Lindsten and Thomas B. Schön from Uppsala University. Smoothing refers to the task of estimating a latent process given noisy measurements of it; the smoothing distribution is the conditional distribution of the whole latent path given all the measurements. The setting is state-space models (what else?!), with a fixed parameter assumed to have been previously estimated.
In this post, I’ll write about coupling particle filters, as proposed in our recent paper with Fredrik Lindsten and Thomas B. Schön from Uppsala University, available on arXiv, and also in this paper by colleagues at NUS. The paper is about a methodology with multiple direct consequences. In this first post, I’ll focus on correlated likelihood estimators; in a later post, I’ll describe a new smoothing algorithm. Both are described in detail in the article. We’ve been blessed to have been advertised by Xi’an’s Og, so glory is just around the corner.
My last post dates back to May 2015… thanks to JB and Julyan for keeping the place busy! I’m not (quite) dead and intend to go back to posting stuff every now and then. And by the way, congrats to both for their new jobs!
Last July, I also started a new job, as an assistant professor in the Department of Statistics at Harvard University, after having spent two years in Oxford. At some point, I might post something on the cultural differences between the European, English and American communities of statisticians.
In the coming weeks, I’ll tell you all about a new paper entitled Coupling of Particle Filters, co-written with Fredrik Lindsten and Thomas B. Schön from Uppsala University in Sweden. We are excited about this coupling idea because it’s simple and yet brings massive gains in many important aspects of inference for state space models (including both parameter inference and smoothing). I’ll be talking about it at the World Congress in Probability and Statistics in Toronto next week and at JSM in Chicago, early in August.
I’ll also try to write about another exciting project, joint work with Christian Robert, Chris Holmes and Lawrence Murray, on modularization, cutting feedback, the infamous cut function of BUGS and all that funny stuff. I talked about it at ISBA 2016, and intend to put the associated tech report on arXiv over the summer.
I have just arXived a review article, written for ESAIM: Proceedings and Surveys, called Sequential Bayesian inference for implicit hidden Markov models and current limitations. The topic is sequential Bayesian estimation: you want to perform inference (say, parameter inference, or prediction of future observations), taking into account parameter and model uncertainties, using hidden Markov models. I hope that the article can be useful for some people: I have tried to stay at a general level, but there are more than 90 references if you’re interested in learning more (sorry in advance for not having cited your article on the topic!). Below I’ll comment on a few points.
This is an article intended for the ISBA bulletin, jointly written by us all at Statisfaction, Rasmus Bååth from Publishable Stuff, Boris Hejblum from Research side effects, Thiago G. Martins from tgmstat@wordpress, Ewan Cameron from Another Astrostatistics Blog and Gregory Gandenberger from gandenberger.org.
Inspired by established blogs, such as the popular Statistical Modeling, Causal Inference, and Social Science or Xi’an’s Og, each of us began blogging as a way to diarize our learning adventures, to share bits of R code or LaTeX tips, and to advertise our own papers and projects. Along the way we’ve come to a new appreciation of the world of academic blogging: a never-ending international seminar, attended by renowned scientists and anonymous users alike. Here we share our experiences by weighing the pros and cons of blogging from the point of view of young researchers.
With Alexandre Thiéry we’ve been working on non-negative unbiased estimators for a while now. Since I’ve been talking about it at conferences and since we’ve just arXived the second version of the article, it’s time for a blog post. This post is a follow-up to a previous post from July, where I was commenting on Playing Russian Roulette with Intractable Likelihoods by Mark Girolami, Anne-Marie Lyne, Heiko Strathmann, Daniel Simpson and Yves Atchade.
It’s been a while since I last wrote about parallelization and GPUs. With colleagues Lawrence Murray and Anthony Lee we have just arXived a new version of Parallel resampling in the particle filter. The setting is that, on modern computing architectures such as GPUs, thousands of operations can be performed in parallel (i.e. simultaneously), and therefore the remaining calculations that cannot be parallelized quickly become the bottleneck. In the case of the particle filter (or any sequential Monte Carlo method such as SMC samplers), that bottleneck is the resampling step. The article investigates this issue and numerically compares different resampling schemes.
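To see why resampling resists parallelization, here is a minimal sketch of two standard schemes, multinomial and systematic resampling. These are textbook versions written for illustration, not the implementations compared in the article, and the function names are assumptions. Both take a vector of non-negative weights and return ancestor indices.

```r
# Multinomial resampling: N ancestor indices drawn independently,
# with probabilities proportional to the weights.
multinomial_resampling <- function(weights) {
  N <- length(weights)
  sample.int(N, size = N, replace = TRUE, prob = weights)
}

# Systematic resampling: a single uniform draw defines a regular grid of
# N points, which is pushed through the inverse of the cumulative weights.
systematic_resampling <- function(weights) {
  N <- length(weights)
  cumulative <- cumsum(weights / sum(weights))  # involves all the weights at once
  u <- (runif(1) + 0:(N - 1)) / N
  # findInterval counts how many cumulative sums are <= u; pmin guards
  # against round-off in the last cumulative sum.
  pmin(findInterval(u, cumulative) + 1L, N)
}

set.seed(42)
weights <- runif(1000)
ancestors <- systematic_resampling(weights)
offspring <- tabulate(ancestors, nbins = length(weights))
```

The sum and cumulative sum of the weights are collective operations over all particles, unlike the propagation and weighting steps, which treat each particle independently.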
Today I am going to introduce the moustache target distribution (moustarget distribution for brevity). Load some packages first.
library(wesanderson) # on CRAN
library(RShapeTarget) # available on https://github.com/pierrejacob/RShapeTarget/
library(PAWL) # on CRAN
Let’s invoke the moustarget distribution.
shape <- create_target_from_shape(
  file_name = system.file(package = "RShapeTarget", "extdata/moustache.svg"),
  lambda = 5)
rinit <- function(size) matrix(rnorm(2 * size), ncol = 2)
moustarget <- target(name = "moustache", dimension = 2,
                     rinit = rinit, logdensity = shape$logd,
                     parameters = shape$algo_parameters)
This defines a target distribution represented by an SVG file using RShapeTarget. The target probability density function is defined on the plane R^2, is proportional to 1 on the segments described in the SVG file, and decreases exponentially fast to 0 away from the segments. The density function of the moustarget is plotted below, a picture being worth a thousand words.
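To make the construction concrete, here is a minimal sketch of such a density, assuming the log-density is minus lambda times the Euclidean distance to the nearest segment. That form is my reading of the description above, not the RShapeTarget source, and the function names and the two toy segments are hypothetical.

```r
# Euclidean distance from the point p to the segment with endpoints a and b
# (all numeric vectors of length 2).
distance_to_segment <- function(p, a, b) {
  ab <- b - a
  len2 <- sum(ab^2)
  if (len2 == 0) return(sqrt(sum((p - a)^2)))  # degenerate segment
  # Project p onto the line through a and b, then clamp to the segment.
  frac <- min(1, max(0, sum((p - a) * ab) / len2))
  sqrt(sum((p - (a + frac * ab))^2))
}

# Unnormalized log-density: 0 on the segments, decreasing linearly with the
# distance, so the density decays exponentially at rate lambda (assumed form).
logdensity_sketch <- function(p, segments, lambda = 5) {
  d <- min(sapply(segments, function(s) distance_to_segment(p, s$a, s$b)))
  -lambda * d
}

# Two hypothetical segments forming a very small moustache.
segments <- list(list(a = c(-1, 0), b = c(0, 0.5)),
                 list(a = c(0, 0.5), b = c(1, 0)))
on_shape <- logdensity_sketch(c(0, 0.5), segments)   # point on the shape
off_shape <- logdensity_sketch(c(0, 1.5), segments)  # point at distance 1
```

Under this assumption, a larger lambda concentrates the mass more tightly around the drawing.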
There’s a nice exhibition open until May 26th at the British Library in London, entitled Beautiful Science: Picturing Data, Inspiring Insight. Various examples of data visualizations are shown, either historical or very modern, or even made especially for the exhibition. Definitely worth a detour if you happen to be in the area: you can see everything in 15 minutes.
In particular, there are nice visualisations of historical climate data, gathered from the logbooks of the English East India Company, whose ships were crossing every possible sea at the beginning of the 19th century. The logbooks contain locations and daily weather reports, handwritten by the captains themselves. It turns out the logbooks are kept at the British Library itself, and some of them are on display at the exhibition. More info on that project here: oldweather.org.