In Bayesian nonparametrics, many models address the problem of density regression, including covariate-dependent processes. These originate in the pioneering work of [current ISBA president] MacEachern (1999), who introduced the general class of dependent Dirichlet processes. The literature on dependent processes has since grown to cover numerous models, such as nonparametric regression, time series, and meta-analysis, to cite but a few, with applications in a wealth of fields including epidemiology, bioassay problems, genomics, and finance. For references, see for instance the chapter by David Dunson in the Bayesian Nonparametrics textbook (edited in 2010 by Nils Lid Hjort, Chris Holmes, Peter Müller and Stephen G. Walker). With Kerrie Mengersen and Judith Rousseau, we have proposed a dependent model in the same vein for modeling the influence of fuel spills on species diversity (arXiv).
In our ecological example, the model provides a series of densities on the Y axis (in our case, posterior density of species diversity), indexed by some covariate X (a pollutant). See file density_plot.txt. The following Plotly R code
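    library(plotly)
    # read.csv() already returns a data frame, here with columns X, Y and Z
    df <- read.csv("density_plot.txt")
    # plotly >= 4.0 expects formulas (~); split = ~X draws one line per covariate value
    plot_ly(df, x = ~Y, y = ~X, z = ~Z, split = ~X, type = "scatter3d", mode = "lines")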
provides a graph as below. For the interactive version, see the RPubs page here.
A promise is a promise: here is a post on the so-called Dirichlet process, or DP.
What is it? A stochastic process whose realizations are probability measures. So it is a distribution on distributions. This is a nice feature for nonparametric Bayesians, who can thus use the DP as a prior when the parameter itself is a probability measure.
As mentioned in an earlier post, the foundational paper that introduced the DP, and still a nice read today, is Ferguson, T.S. (1973), A Bayesian Analysis of Some Nonparametric Problems. The Annals of Statistics, 1, 209-230. I will not go into great detail here, but will mainly stress the discreteness of the DP.
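First, a DP, say on $\mathcal{X}$, has two parameters: a precision parameter $M > 0$ and a base measure $G_0$ on $\mathcal{X}$. Basically, $G_0$ is the mean of the process, and $M$ measures the inverse of its variance. Formally, we write $G \sim \mathrm{DP}(M, G_0)$ for a value of the DP. Then, for every measurable subset $A$ of $\mathcal{X}$, $\mathrm{E}[G(A)] = G_0(A)$ and $\mathrm{Var}[G(A)] = G_0(A)(1 - G_0(A))/(M + 1)$. Actually, a more accurate statement says that $G(A) \sim \mathrm{Beta}(M G_0(A), M(1 - G_0(A)))$.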
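As a quick check, the mean and variance of the process on a set follow from the Beta marginal. The notation is assumed here, since the original symbols are not fixed by the text: $M$ is the precision parameter, $G_0$ the base probability measure, $G$ a draw from the DP and $A$ a measurable set.

```latex
% Assumed notation: G ~ DP(M, G_0), a = M G_0(A), b = M (1 - G_0(A)).
\begin{align*}
G(A) &\sim \mathrm{Beta}(a, b), \qquad a + b = M, \\
\mathrm{E}[G(A)] &= \frac{a}{a + b} = G_0(A), \\
\mathrm{Var}[G(A)] &= \frac{ab}{(a + b)^2 (a + b + 1)} = \frac{G_0(A)\,(1 - G_0(A))}{M + 1}.
\end{align*}
```

The variance shrinks as $M$ grows, which is the sense in which the precision parameter measures the inverse of the variance.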
A realization $G$ is almost surely discrete. In other words, it is a mixture of Dirac masses. Let us explain this explicit expression as a countable mixture, due to Sethuraman (1994). Let $V_i \sim \mathrm{Beta}(1, M)$ i.i.d., with $M$ the precision parameter, and $\theta_i \sim G_0$ i.i.d., with $G_0$ the base measure, all mutually independent. Define $p_1 = V_1$, and $p_i = V_i \prod_{j < i} (1 - V_j)$ for $i \geq 2$. Then $G$ can be written as $G = \sum_{i=1}^{\infty} p_i \delta_{\theta_i}$. This is called the Sethuraman representation, also referred to as “stick-breaking”. The reason for the name is in the definition of the weights $p_i$: each can be seen as the length of a part of a stick of unit length, broken into infinitely many sticks. The first stick is of length $p_1 = V_1$. The remaining part has length $1 - V_1$, and is broken at a fraction $V_2$ of its length, which defines a second stick of length $p_2 = V_2 (1 - V_1)$. And so forth. We see easily that this builds a sequence of $p_i$s that sum to 1, because the remaining part at step $n$ has length $\prod_{j \leq n} (1 - V_j)$, which goes to 0 almost surely.
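The stick-breaking construction can be simulated directly. The R sketch below is only an illustration, under assumptions of mine: the function name `rdp_stick`, the truncation at `n_atoms` sticks (the true representation is infinite), the default standard normal base measure, and the value `M = 3` are not taken from the text.

```r
# Truncated stick-breaking draw from a Dirichlet process (illustrative sketch).
# M: precision parameter; n_atoms: truncation level (an assumption of this sketch);
# r_base: function that draws from the base measure (standard normal by default).
rdp_stick <- function(M, n_atoms = 500, r_base = rnorm) {
  v <- rbeta(n_atoms, 1, M)              # breaking proportions V_i ~ Beta(1, M)
  remaining <- c(1, cumprod(1 - v))      # stick length left before each break
  p <- v * remaining[1:n_atoms]          # weights p_i = V_i * prod_{j < i} (1 - V_j)
  theta <- r_base(n_atoms)               # atom locations theta_i drawn from the base measure
  list(weights = p, atoms = theta, leftover = remaining[n_atoms + 1])
}

set.seed(1)
g <- rdp_stick(M = 3)
# The weights plus the unbroken remainder of the stick add up to one by construction.
total <- sum(g$weights) + g$leftover
```

Calling `plot(g$atoms, g$weights, type = "h")` then draws each Dirac mass as a vertical segment whose height is its weight. The `leftover` value shows how much mass the truncation ignores.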
Now let us illustrate this with the nice plots of Eric Barat. He chooses a standard normal for the base measure, which is quite usual, and a fixed value of the precision parameter. A way to get a graphical view of a realization is to represent a Dirac mass by its weight:
To quote the favorite adjectives of Valencia discussants, the three talks were terrific and thought-provoking.
Eric Moulines presented Multiple Try Methods in MCMC algorithms. Instead of a unique proposal at each step, one proposes several values, among which only one is kept. The proposals can be chosen independent, but introducing dependence in the right way speeds up convergence to the stationary distribution. An interesting feature of this algorithm, especially for Pierre, is that it allows parallel computation (across the multiple proposals), whereas the standard Metropolis-Hastings algorithm is essentially sequential. See as well Pierre, Christian and Murray Smith’s block Independent Metropolis-Hastings algorithm for further details.
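To make the multiple-try idea concrete, here is a minimal R sketch of the basic scheme of Liu, Liang and Wong (2000) with independent proposals from a symmetric Gaussian random walk, in which case the selection weights reduce to the target density. It is not necessarily the variant presented in the talk, and the names `mtm`, `log_target`, `k` and `sd_prop` are assumptions made for illustration.

```r
# Multiple-try Metropolis sketch with a symmetric Gaussian random-walk proposal.
# Function and argument names are illustrative assumptions.
log_sum_exp <- function(l) {
  m <- max(l)
  m + log(sum(exp(l - m)))               # numerically stable log(sum(exp(l)))
}

mtm <- function(log_target, x0, n_iter = 5000, k = 5, sd_prop = 2) {
  chain <- numeric(n_iter)
  cur <- x0
  for (t in seq_len(n_iter)) {
    y <- rnorm(k, mean = cur, sd = sd_prop)            # k candidates around the current state
    lw_y <- log_target(y)                              # log weights: target density at each candidate
    sel <- y[sample.int(k, 1, prob = exp(lw_y - max(lw_y)))]  # keep one, with probability proportional to its weight
    ref <- c(rnorm(k - 1, mean = sel, sd = sd_prop), cur)     # reference set around the kept value, plus the current state
    lw_ref <- log_target(ref)
    # accept with probability min(1, sum of candidate weights / sum of reference weights)
    if (log(runif(1)) < log_sum_exp(lw_y) - log_sum_exp(lw_ref)) cur <- sel
    chain[t] <- cur
  }
  chain
}

set.seed(42)
chain <- mtm(function(x) dnorm(x, log = TRUE), x0 = 0)  # standard normal target
```

The `k` evaluations of `log_target` within a step do not depend on each other, which is where the parallel computation mentioned above would come in.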
Jean-Marc Bardet introduced a way to detect ruptures in time series. He focuses on causal time series, i.e., series that can be written only in terms of present and past innovations, for example . A rupture at time t means that the parameters change at t.
The must-see talk for me was Eric Barat's presentation on BNP modeling for space-time emission tomography. For newcomers, BNP means more than a bank: Bayesian nonparametric. It is nice to see a very efficient application of BNP methods to a medical field. Eric kindly shares his slides (cf. below), which I recommend, especially the section on random probability measures: he reviews properties of the Dirichlet process and its various representations (Chinese restaurant, stick-breaking), and extends to the Pitman-Yor process and Pitman-Yor mixtures. Then he gives posterior simulations by Gibbs sampling. I am interested in time-dependent models, and I am thankful to Eric for his pointer to a recent article by Chung and Dunson on the local Dirichlet process, a nifty and simple construction of a dependent Dirichlet process.
In a few days, I will try to make clear what the Dirichlet process is!
My contribution this year to the MCB seminar at CREST is about nonparametric Bayes (today at 2 pm, room 14). I shall start with 1) a few words of history, then introduce 2) the Dirichlet Process by several of its numerous defining properties. I will next introduce an extension of the Dirichlet Process, namely 3) the DP mixtures, useful for applications like 4) density estimation. Last, I will show 5) posterior MCMC simulations for a density model and give some 6) reference textbooks.
1) A few words of history
Ferguson, T.S. (1973), A Bayesian Analysis of Some Nonparametric Problems. The Annals of Statistics, 1, 209-230.
Doksum, K. (1974). Tailfree and neutral random probabilities and their posterior distributions. The Annals of Probability, 2, 183-201.
Blackwell, D. (1973). Discreteness of Ferguson selections. The Annals of Statistics, 1, 356-358.
Blackwell, D. and MacQueen, J. (1973). Ferguson distributions via Polya urn schemes. The Annals of Statistics, 1, 353-355.
2) Dirichlet Process defining properties
Mainly based on Peter Müller’s slides of class 2 at Santa Cruz this summer.
Dirichlet distribution on finite partitions
Stick-breaking/ Sethuraman representation
Polya urn analogy for the predictive probability function
Normalization of a Gamma Process
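Among these properties, the Polya urn view is easy to simulate. In the R sketch below, each new value is a fresh draw from the base measure with probability M / (M + n), and otherwise a copy of one of the n previous values chosen uniformly. The names `rpolya_urn`, `M` and `r_base`, the standard normal base measure and the value `M = 2` are assumptions made for illustration.

```r
# Sequential draws from the Polya urn (Blackwell-MacQueen) predictive rule.
# M: precision parameter; r_base: draws from the base measure (assumed standard normal).
rpolya_urn <- function(n, M, r_base = rnorm) {
  theta <- numeric(n)
  for (i in seq_len(n)) {
    if (runif(1) < M / (M + i - 1)) {
      theta[i] <- r_base(1)                     # new value from the base measure
    } else {
      theta[i] <- theta[sample.int(i - 1, 1)]   # copy a previous value, uniformly at random
    }
  }
  theta
}

set.seed(2)
x <- rpolya_urn(200, M = 2)
n_distinct <- length(unique(x))   # ties show the discreteness of the underlying DP
```

The number of distinct values grows slowly with n, so `n_distinct` is typically far smaller than the sample size.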
3) Dirichlet Process Mixtures (DPM)
Convolution of a DP with a continuous kernel to circumvent its discreteness.
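In formula form, and with notation that is assumed here rather than taken from the slides, the mixture replaces each Dirac mass of a DP draw by a continuous kernel centered at the corresponding atom:

```latex
% Assumed notation: k(y | \theta) is a continuous kernel (for instance a normal density),
% and G = \sum_i p_i \delta_{\theta_i} is a discrete draw from the DP.
\begin{equation*}
f_G(y) = \int k(y \mid \theta)\, \mathrm{d}G(\theta) = \sum_{i=1}^{\infty} p_i\, k(y \mid \theta_i).
\end{equation*}
```

The random density $f_G$ is continuous even though $G$ itself is discrete, which is what makes the DPM usable as a prior for density estimation.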
4) Density estimation with DPMs
5) Posterior MCMC simulation
Based on Peter Müller’s slides of class 6.
6) Reference textbooks
Bayesian nonparametrics, 2003, J. K. Ghosh, R. V. Ramamoorthi.
Bayesian nonparametrics, 2010, Nils Lid Hjort, Chris Holmes, Peter Müller, Stephen G. Walker, and contributions by Subhashis Ghosal, Antonio Lijoi, Igor Prünster, Yee Whye Teh, Michael I. Jordan, Jim Griffin, David B. Dunson, Fernando Quintana.
• Dirichlet processes and the Chinese restaurant process
• Dirichlet process mixture models
• Polya trees
• Dependent Dirichlet processes
• Species sampling models
• Product partition models
• Beta processes and the Indian buffet process
• Computational tools