Do you really need to work 60+ hours per week to be a good researcher?

Posted in General, publishing by JB Salomond on 6 February 2018


There is a controversy these days on social media about academics claiming that “if you do not feel like working 60+ hours per week including weekends and evenings you should probably find another job”. This is utterly frustrating. As an academic myself, I clearly do not feel like working day and night, even when I am neck-deep in a project. Does that make me a poor assistant prof?


It is no secret that academic research is not, in general, a 9-to-5 job (not saying that it cannot be). I myself usually work on weekends, when commuting, or during holidays. I always carry a few papers that I did not have time to read, or relentlessly write equations in my notebook when an idea strikes me during the morning commute. I do think about how I could improve the next day's lecture and make changes to my slides late in the evening. That is partly because I am disorganised, and partly because the job sort of requires it. We lack time to do all the things we want to do while at the lab. Conferences, seminars and meetings wreck whatever schedule you had. So you might end up seeing any time off as a lost opportunity to do more research.

This situation is clearly not good, and many academics, including me, have or have had a hard time dealing with it. In particular when you are still a PhD student/postdoc/tenure-tracker (you name it) and need to stand out from the pack to get a position in an extremely competitive environment. And when senior full professors tell you that you need to work even harder if you want to be considered worthy, that is clearly not helping.

My view is that even if we all want to shoot for the moon (and hit the damn thing), it is totally fine to simply do some good research. Not every one of your papers has to be groundbreaking; as long as it contributes to the field, it should be good enough and acknowledged as such. If your goal is to have that many JRSS/Biometrika/Annals of Stats papers and a sky-rocketing h-index no matter what, you are probably doing it wrong. Competition between academics can be a good boost from time to time, but it should not be the be-all and end-all. More importantly, it should not be what drives an academic career. The negative externalities of this system are depressed junior researchers and limited scientific contributions squeezed out of our brains just to extend our publication records.

So what should we do about this? First, stop bragging about how many hours you work per week: if you feel like working 24/7, good for you, but it does not have to be that way for everyone. Second, stop judging academics (especially junior ones) on dumb metrics such as the h-index; if you need to evaluate a candidate, read their research. In short, cut the competition and be supportive. Make academia fun again!


Sub-Gaussian property for the Beta distribution (part 3, final)

Posted in General, R by Julyan Arbel on 26 December 2017

When a Beta random variable wants to act like a Bernoulli: convergence of optimal proxy variance.

In this third and last post about the sub-Gaussian property for the Beta distribution [1] (post 1 and post 2), I would like to show the interplay with the Bernoulli distribution, as well as some connections with optimal transport (OT is a hot topic in general, and also on this blog with Pierre’s posts on Wasserstein ABC). (more…)

A big problem in our community

Posted in General, Seminar/Conference, Statistics by Pierre Jacob on 14 December 2017

“Tout va très bien”, meaning “all is well”, by Franquin.

Hi all,

Kristian Lum, who was already one of my Statistics superheroes for her many interesting papers and great talks, bravely wrote the following text about her experience as a young statistician going to conferences:

I can’t thank Kristian enough for speaking out. Her experience is both shocking and hardly surprising: many, many academics report similar stories. This simply cannot go on.

I happen to have gone to the conferences mentioned by Kristian, and my experience as a young man was completely different. It was all about meeting interesting people, discussing ideas, being challenged, and having a good time. Nobody harassed, touched or assaulted me. There was some flirting, as I guess is natural when hundreds of people are put in sunny places far away from home, but I was never the victim of any misconduct or abuse of power. So instead of driving me out of the field, conferences became important, enriching and rewarding moments of my professional life.

Looking back at those conferences I feel sick, and heartbroken, at the thought that some of my peers were having such a difficult time, because of predators who don’t ever face the consequences of their actions. Meanwhile I was part of the silent majority.

The recent series of revelations about sexual harassment and assault in other professional environments indicates that this is not specific to our field, nor to academia. But that does not make it any more acceptable. I know for a fact that many leaders of our field take this issue extremely seriously (as Kristian mentions too), but clearly much, much more needs to be done. The current situation is just shameful; strong and coordinated actions will be needed to fix it. Thanks again to Kristian for the wake-up call.

Dynamic publication list for research webpage using arXiv, HAL, or bibtex2html

Posted in General by Julyan Arbel on 23 October 2017

Well of course, “dynamic” is conditional upon some manual feeding. If you put your papers on arXiv or HAL, then both propose dynamic widgets. If you maintain a .bib file of your papers, you can use tools like bibtex2html. This is not dynamic at all, but it allows finer tuning of the URL links you might want to add than the arXiv or HAL options do. I review these three options below. (more…)

ISBA elections, let’s go voting

Posted in General by Julyan Arbel on 16 October 2017

So it seems even Thomas B. went voting.

The International Society for Bayesian Analysis (ISBA) is running elections until November 15. This year, two contributors to this blog, Nicolas Chopin and myself, are running for an ISBA Section office. The sections of the society, nine in number as of today, gather researchers with common research interests: Computation, Objective Bayes, Nonparametrics, etc.

Here are our candidate statements:



Statistical learning in models made of modules

Posted in General, Statistics by Pierre Jacob on 9 September 2017



Graph of variables in a model made of two modules: the first with parameter theta1 and data Y1, and the second with parameter theta2 and data Y2, defined conditionally upon theta1.



With Lawrence Murray, Chris Holmes and Christian Robert, we have recently arXived a paper entitled “Better together? Statistical learning in models made of modules”. Christian blogged about it already. The context is the following: parameters of a first model appear as inputs in another model. The question is whether to follow a “joint model approach”, where all parameters are estimated simultaneously with all of the data, or a “modular approach”, where the first parameters are estimated with the first model only, ignoring the second model. Examples of modular approaches include the “cut distribution”, or “two-step estimators” (e.g. Chapter 6 of Newey & McFadden (1994)). In many fields, modular approaches are preferred, because the second model is suspected of being more misspecified than the first one. Misspecification of the second model can “contaminate” the joint model, with dire consequences for inference, as described e.g. in Bayarri, Berger & Liu (2009). Other reasons include computational constraints and the lack of simultaneous availability of all models and associated data. In the paper, we try to make sense of the defects of the joint model approach and we propose a principled, quantitative way of choosing between the joint and modular approaches.
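To fix ideas, here is a small R sketch of the contrast between the two approaches, on a toy conjugate Gaussian two-module model of my own (purely illustrative, not taken from the paper): module 1 has Y1 ~ N(theta1, 1), module 2 has Y2 ~ N(theta1 + theta2, 1), and a tight prior on theta2 prevents it from absorbing a bias present in the second dataset.

set.seed(1)
n1 <- 100; n2 <- 100
y1 <- rnorm(n1, mean = 0, sd = 1)   # module 1 is well specified, true theta1 = 0
y2 <- rnorm(n2, mean = 3, sd = 1)   # module 2 carries a bias that the model + prior cannot explain
s1 <- 10; s2 <- 0.1                 # prior standard deviations for theta1 and theta2
# joint approach: the Gaussian posterior of (theta1, theta2) has precision Q and mean solve(Q, b)
Q <- matrix(c(1 / s1^2 + n1 + n2, n2,
              n2, 1 / s2^2 + n2), nrow = 2, byrow = TRUE)
b <- c(sum(y1) + sum(y2), sum(y2))
joint_mean <- solve(Q, b)
# modular (cut) approach: theta1 is estimated from module 1 only, then plugged into module 2
cut_theta1 <- sum(y1) / (1 / s1^2 + n1)
cut_theta2 <- (sum(y2) - n2 * cut_theta1) / (1 / s2^2 + n2)
round(c(joint_theta1 = joint_mean[1], cut_theta1 = cut_theta1), 2)

With these numbers the joint estimate of theta1 should land around 1 while the cut estimate stays near 0, which is exactly the kind of contamination of the first module by the second one discussed above.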


School of Statistics for Astrophysics, Autrans, France, October 9-13

Posted in General by Julyan Arbel on 7 September 2017

Didier Fraix-Burnet (IPAG), Stéphane Girard (Inria) and myself are organising a School of Statistics for Astrophysics, Stat4Astro, to be held in October in France. The primary goal of the School is to train astronomers in the use of modern statistical techniques. It also aims at bridging the gap between the two communities by emphasising practice through joint work, at giving firm grounding to the theoretical lessons, and at initiating work on problems brought by the participants. There have been two previous sessions of this school, one on regression and one on clustering. The speakers of this edition, including Christian Robert, Roberto Trotta and David van Dyk, will focus on the Bayesian methodology, with the moral support of the Bayesian Society, ISBA. The interest in this statistical approach in astrophysics probably stems from its necessity and its success in determining cosmological parameters from observations, especially from the cosmic background fluctuations. The cosmological community has thus been very active in this field (see for instance the Cosmostatistics Initiative COIN).

But the Bayesian methodology, complementary to the more classical frequentist one, has many applications in physics in general, thanks to its ability to incorporate a priori knowledge, such as the uncertainties arising from the observational processes, into the inference.

As with many sophisticated statistical techniques, astronomers are generally not familiar with Bayesian methodology, even though it is becoming more and more widespread and useful in the literature. This school will provide the participants with both a strong theoretical background and a solid practice of Bayesian inference:

  • Introduction to R and Bayesian Statistics (Didier Fraix-Burnet, Institut de Planétologie et d’Astrophysique de Grenoble)
  • Foundations of Bayesian Inference (David van Dyk, Imperial College London)
  • Markov chain Monte Carlo (David van Dyk, Imperial College London)
  • Model Building (David van Dyk, Imperial College London)
  • Nested Sampling, Model Selection, and Bayesian Hierarchical Models (Roberto Trotta, Imperial College London)
  • Approximate Bayesian Computation (Christian Robert, Univ. Paris-Dauphine, Univ. Warwick and Xi’an (!))
  • Bayesian Nonparametric Approaches to Clustering (Julyan Arbel, Université Grenoble Alpes and Inria)

Feel free to register, we are not fully booked yet!


Particle methods in Statistics

Posted in General, Statistics by Pierre Jacob on 30 June 2017

A statistician sampling from a posterior distribution with particle methods

Hi there,

In this post, just in time for the summer, I propose a reading list for people interested in discovering the fascinating world of particle methods, aka sequential Monte Carlo methods, and their use in statistics. I also take the opportunity to advertise the SMC workshop in Uppsala (30 Aug – 1 Sept), which features an amazing list of speakers, including my postdoctoral collaborator Jeremy Heng:
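In the meantime, for readers who have never seen these methods in action, here is a minimal bootstrap particle filter written in R. It is a toy sketch of my own on a linear Gaussian state-space model (all modelling choices below are assumptions made for the example, not taken from the workshop or the reading list).

set.seed(2017)
# toy model: x_t = 0.9 * x_{t-1} + N(0, 1), y_t = x_t + N(0, 1)
T_len <- 50; N <- 1000
x <- numeric(T_len); y <- numeric(T_len)
x[1] <- rnorm(1); y[1] <- x[1] + rnorm(1)
for (t in 2:T_len) {
  x[t] <- 0.9 * x[t - 1] + rnorm(1)
  y[t] <- x[t] + rnorm(1)
}
# bootstrap particle filter: propagate with the model, weight by the likelihood, resample
particles <- rnorm(N)
filter_means <- numeric(T_len)
log_lik <- 0
for (t in 1:T_len) {
  if (t > 1) particles <- 0.9 * particles + rnorm(N)
  logw <- dnorm(y[t], mean = particles, sd = 1, log = TRUE)
  maxlw <- max(logw)
  w <- exp(logw - maxlw)
  log_lik <- log_lik + maxlw + log(mean(w))   # running log of the particle likelihood estimate
  w <- w / sum(w)
  filter_means[t] <- sum(w * particles)       # estimate of E[x_t | y_1:t]
  particles <- particles[sample.int(N, N, replace = TRUE, prob = w)]   # multinomial resampling
}
plot(y, type = "p"); lines(filter_means, col = "red")   # filtering means track the hidden state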


Sub-Gaussian property for the Beta distribution (part 1)

Posted in General by Julyan Arbel on 2 May 2017


With my friend Olivier Marchal (mathematician, not filmmaker, nor the cop), we have just arXived a note on the sub-Gaussianity of the Beta and Dirichlet distributions.

The notion, introduced by Jean-Pierre Kahane, is as follows:

A random variable X with finite mean \mu=\mathbb{E}[X] is sub-Gaussian if there is a positive number \sigma such that:

\mathbb{E}[\exp(\lambda (X-\mu))]\le\exp\left(\frac{\lambda^2\sigma^2}{2}\right)\,\,\text{for all } \lambda\in\mathbb{R}.

Such a constant \sigma^2 is called a proxy variance, and we say that X is \sigma^2-sub-Gaussian. If X is sub-Gaussian, one is usually interested in the optimal proxy variance:

 \sigma_{\text{opt}}^2(X)=\min\{\sigma^2\geq 0\text{ such that } X \text{ is } \sigma^2\text{-sub-Gaussian}\}.

Note that the variance always gives a lower bound on the optimal proxy variance: \text{Var}[X]\leq \sigma_{\text{opt}}^2(X). In particular, when \sigma_{\text{opt}}^2(X)=\text{Var}[X], X is said to be strictly sub-Gaussian.
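A standard example, to fix ideas: a Bernoulli random variable X with parameter 1/2 is strictly sub-Gaussian. Indeed X-\mu takes the values \pm\frac{1}{2} with equal probability, so that

\mathbb{E}[\exp(\lambda(X-\mu))]=\cosh\left(\frac{\lambda}{2}\right)\leq\exp\left(\frac{\lambda^2}{8}\right)\,\,\text{for all } \lambda\in\mathbb{R},

which shows \sigma_{\text{opt}}^2(X)\leq\frac{1}{4}=\text{Var}[X], hence \sigma_{\text{opt}}^2(X)=\text{Var}[X]=\frac{1}{4}.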

The sub-Gaussian property is closely related to the tails of the distribution. Intuitively, being sub-Gaussian amounts to having tails lighter than a Gaussian. This is actually a characterization of the property. Let Z\sim\mathcal{N}(0,1). Then:

X \text{ is sub-Gaussian } \iff \exists c, \forall x\geq0:\, \mathsf{P}(|X-\mathbb{E}[X]|\geq x) \leq c\mathsf{P}(|Z|\geq x).

That equivalence clearly implies exponential upper bounds for the tails of the distribution since a Gaussian Z\sim\mathcal{N}(0,\sigma^2) satisfies

\mathsf{P}(Z\ge x)\le\exp(-\frac{x^2}{2\sigma^2}).

That can also be seen directly: for a \sigma^2-sub-Gaussian variable X,

\forall\, \lambda>0\,:\,\,\mathsf{P}(X-\mu\geq x) = \mathsf{P}(e^{\lambda(X-\mu)}\geq e^{\lambda x})\leq \frac{\mathbb{E}[e^{\lambda(X-\mu)}]}{e^{\lambda x}}\quad\text{by Markov inequality,}

\leq\exp(\frac{\sigma^2\lambda^2}{2}-\lambda x)\quad\text{by sub-Gaussianity.}

The polynomial function \lambda\mapsto \frac{\sigma^2\lambda^2}{2}-\lambda x is minimized on \mathbb{R}_+ at \lambda = \frac{x}{\sigma^2}, for which we obtain

\mathsf{P}(X-\mu\geq x) \leq\exp(-\frac{x^2}{2\sigma^2}).

In that sense, the sub-Gaussian property of any compactly supported random variable X comes for free since in that case the tails are obviously lighter than those of a Gaussian. A simple general proxy variance is given by Hoeffding’s lemma. Let X be supported on [a,b] with \mathbb{E}[X]=0. Then for any \lambda\in\mathbb{R},

\mathbb{E}[\exp(\lambda X)]\leq\exp\left(\frac{(b-a)^2}{8}\lambda^2\right)

so X is \frac{(b-a)^2}{4}-sub-Gaussian.

Back to the Beta distribution, where [a,b]=[0,1]: this shows the Beta is \frac{1}{4}-sub-Gaussian. Finding the optimal proxy variance is a more challenging issue. In addition to characterizing the optimal proxy variance of the Beta distribution in the note, we provide the simple upper bound \frac{1}{4(\alpha+\beta+1)}. It matches Hoeffding’s bound in the extremal case \alpha\to0, \beta\to0, where the Beta random variable concentrates on the two-point set \{0,1\} (and where Hoeffding’s bound is tight).
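As a quick numerical sanity check (an R sketch of mine, not taken from the note), one can compare the moment generating function of a centred Beta random variable with the Gaussian bound obtained from the proxy variance \frac{1}{4(\alpha+\beta+1)}:

# check that sigma2 = 1 / (4 * (alpha + beta + 1)) is a valid proxy variance for Beta(alpha, beta)
alpha <- 2; beta <- 5
mu <- alpha / (alpha + beta)
sigma2 <- 1 / (4 * (alpha + beta + 1))
mgf <- function(lambda) {
  integrate(function(x) exp(lambda * (x - mu)) * dbeta(x, alpha, beta), 0, 1)$value
}
lambdas <- seq(-20, 20, by = 0.5)
bound <- exp(lambdas^2 * sigma2 / 2)
all(sapply(lambdas, mgf) <= bound + 1e-8)   # TRUE (up to numerical integration error)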

In obtaining the bound \frac{1}{4(\alpha+\beta+1)}, we prove a recent conjecture made by Sam Elder in the context of Bayesian adaptive data analysis. I’ll say more about deriving the optimal proxy variance in a follow-up post soon.




ABC in Banff

Posted in General, Seminar/Conference, Statistics by Pierre Jacob on 6 March 2017

Banff, also known as not the worst location for a scientific meeting.

Hi all,

Last week I attended a wonderful meeting on Approximate Bayesian Computation in Banff, which gathered a nice crowd of ABC users and enthusiasts, including lots of people outside of computational stats, whom I wouldn’t have met otherwise. Christian blogged about it there. My talk on Inference with Wasserstein distances is available as a video here (joint work with Espen Bernton, Mathieu Gerber and Christian Robert, the paper is here). In this post, I’ll summarize a few (personal) points and questions on ABC methods, after recalling the basics of ABC (ahem).
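For readers who have never seen ABC in action, here is the most basic version, an ABC rejection sampler, written as a toy R sketch of my own (the model, prior and tolerance below are arbitrary choices for illustration, not related to the talk or the paper).

set.seed(3)
# toy example: infer the mean of a Gaussian with known variance
y_obs <- rnorm(100, mean = 1.5, sd = 1)
n_prop <- 2e4
theta <- runif(n_prop, -5, 5)                      # draws from a uniform prior
# simulate pseudo-data for each draw and keep the sample mean as summary statistic
summaries <- sapply(theta, function(t) mean(rnorm(100, mean = t, sd = 1)))
epsilon <- 0.05                                    # tolerance on the summary statistic
accepted <- theta[abs(summaries - mean(y_obs)) < epsilon]
c(abc_posterior_mean = mean(accepted), acceptance_rate = length(accepted) / n_prop)

The accepted draws approximate the posterior of the mean given the summary statistic, and shrinking epsilon trades acceptance rate for accuracy, which is where most of the interesting questions about ABC begin.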

