## Sub-Gaussian property for the Beta distribution (part 3, final)

In this third and last post about the sub-Gaussian property for the Beta distribution [1] (post 1 and post 2), I would like to show the interplay with the Bernoulli distribution as well as some connections with optimal transport (OT is a hot topic in general, and also on this blog with Pierre’s posts on Wasserstein ABC).

## Sub-Gaussian property for the Beta distribution (part 2)

As a follow-up on my previous post on the sub-Gaussian property for the Beta distribution [1], I’ll give here a visual illustration of the proof.

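A random variable $X$ with finite mean $\mu=\mathbb{E}[X]$ is sub-Gaussian if there is a positive number $\sigma$ such that:

$$\mathbb{E}[\exp(\lambda(X-\mu))]\leq\exp\left(\frac{\lambda^2\sigma^2}{2}\right)\quad\text{for all } \lambda\in\mathbb{R}.$$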

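We focus on $X$ being a $\mathrm{Beta}(\alpha,\beta)$ random variable. Its moment generating function $\mathbb{E}[\exp(\lambda X)]$ is given by the Kummer function, also called the confluent hypergeometric function, ${}_1F_1(\alpha;\alpha+\beta;\lambda)$. So $X$ is $\sigma^2$-sub-Gaussian as soon as the difference function

$$u_\sigma(\lambda)=\exp\left(\frac{\alpha}{\alpha+\beta}\lambda+\frac{\sigma^2}{2}\lambda^2\right)-{}_1F_1(\alpha;\alpha+\beta;\lambda)$$

remains nonnegative on $\mathbb{R}$ (it always vanishes at $\lambda=0$). This difference function $u_\sigma$ is plotted on the right panel above for a fixed pair of parameters $(\alpha,\beta)$. In the plot, $\sigma^2$ is varying from green for the variance $\mathrm{Var}[X]=\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$ (which is a lower bound to the optimal proxy variance) to blue for the value $\frac{1}{4(\alpha+\beta+1)}$, a simple upper bound given by Elder (2016) [2]. The idea of the proof is simple: the optimal proxy variance corresponds to the value of $\sigma^2$ for which $u_\sigma$ admits a double zero, as illustrated with the red curve (black dot). The left panel shows the curves with $\sigma^2$ varying, interpolating from green for $\sigma^2=\mathrm{Var}[X]$ to blue for $\sigma^2=\frac{1}{4(\alpha+\beta+1)}$, with only one curve qualifying as the optimal proxy variance in red.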
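To make the picture concrete, here is a minimal Python sketch of the difference function and of a crude numerical approximation of the optimal proxy variance. It is not the method of the proof: it only uses the fact that $X$ is $\sigma^2$-sub-Gaussian exactly when $\sigma^2\geq 2\,(\log\mathbb{E}[e^{\lambda X}]-\mu\lambda)/\lambda^2$ for every $\lambda\neq0$, and takes the largest such ratio over a grid. The function names, the parameters $(\alpha,\beta)=(2,5)$, the grid range `lam_max` and the series truncation `terms` are all assumptions made for illustration.

```python
import math


def kummer_1f1(a, b, x, terms=300):
    """Confluent hypergeometric function 1F1(a; b; x), summed from its power series."""
    if x < 0:
        # Kummer's transformation 1F1(a; b; x) = e^x * 1F1(b - a; b; -x)
        # avoids cancellation in the alternating series.
        return math.exp(x) * kummer_1f1(b - a, b, -x, terms)
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (a + k) / (b + k) * x / (k + 1)
    return total


def difference_function(alpha, beta, sigma2, lam):
    """exp(mu*lam + sigma2*lam^2/2) minus the Beta(alpha, beta) moment generating function."""
    mu = alpha / (alpha + beta)
    bound = math.exp(mu * lam + 0.5 * sigma2 * lam**2)
    return bound - kummer_1f1(alpha, alpha + beta, lam)


def proxy_variance_on_grid(alpha, beta, lam_max=40.0, steps=2000):
    """Largest value of 2*(log MGF(lam) - mu*lam)/lam^2 over a grid of nonzero lam.

    This is only a grid approximation (from below) of the optimal proxy variance,
    and it assumes the maximizer lies in [-lam_max, lam_max].
    """
    mu = alpha / (alpha + beta)
    best = 0.0
    for i in range(-steps, steps + 1):
        if i == 0:
            continue
        lam = lam_max * i / steps
        log_mgf = math.log(kummer_1f1(alpha, alpha + beta, lam))
        best = max(best, 2.0 * (log_mgf - mu * lam) / lam**2)
    return best


alpha, beta = 2.0, 5.0  # hypothetical parameters, chosen only for illustration
variance = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1.0))
upper = 1.0 / (4.0 * (alpha + beta + 1.0))
print("variance          :", variance)
print("grid proxy variance:", proxy_variance_on_grid(alpha, beta))
print("simple upper bound :", upper)
```

The printed grid value should sit between the variance and the simple upper bound. For larger values of $\alpha+\beta$ the grid range would likely need to be widened.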

#### References

[1] Marchal and Arbel (2017), On the sub-Gaussianity of the Beta and Dirichlet distributions. Electronic Communications in Probability, 22:1–14. Code on GitHub.

[2] Elder (2016), Bayesian Adaptive Data Analysis Guarantees from Subgaussianity, https://arxiv.org/abs/1611.00065

## Sub-Gaussian property for the Beta distribution (part 1)

With my friend Olivier Marchal (mathematician, not filmmaker, nor the cop), we have just arXived a note on the sub-Gaussianity of the Beta and Dirichlet distributions.

The notion, introduced by Jean-Pierre Kahane, is as follows:

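A random variable $X$ with finite mean $\mu=\mathbb{E}[X]$ is sub-Gaussian if there is a positive number $\sigma$ such that:

$$\mathbb{E}[\exp(\lambda(X-\mu))]\leq\exp\left(\frac{\lambda^2\sigma^2}{2}\right)\quad\text{for all } \lambda\in\mathbb{R}.$$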

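Such a constant $\sigma^2$ is called a proxy variance, and we say that $X$ is $\sigma^2$-sub-Gaussian. If $X$ is sub-Gaussian, one is usually interested in the optimal proxy variance:

$$\sigma^2_{\mathrm{opt}}(X)=\min\{\sigma^2\geq0 \text{ such that } X \text{ is } \sigma^2\text{-sub-Gaussian}\}.$$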

Note that the variance always gives a lower bound on the optimal proxy variance: $\mathrm{Var}[X]\leq\sigma^2_{\mathrm{opt}}(X)$. In particular, when $\sigma^2_{\mathrm{opt}}(X)=\mathrm{Var}[X]$, $X$ is said to be strictly sub-Gaussian.
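A short sketch of why the variance is a lower bound, for a variable $X$ with mean $\mu$ that is $\sigma^2$-sub-Gaussian: expand both sides of the defining inequality around $\lambda=0$ (the moment generating function is finite for every $\lambda$, so the expansion is legitimate).

$$\mathbb{E}\left[e^{\lambda(X-\mu)}\right]=1+\frac{\lambda^2}{2}\mathrm{Var}[X]+o(\lambda^2),\qquad \exp\left(\frac{\lambda^2\sigma^2}{2}\right)=1+\frac{\lambda^2\sigma^2}{2}+o(\lambda^2).$$

The first-order term vanishes on the left because $\mathbb{E}[X-\mu]=0$. Subtracting 1, dividing by $\lambda^2/2$ and letting $\lambda\to0$ gives $\mathrm{Var}[X]\leq\sigma^2$ for every proxy variance $\sigma^2$, hence also for the optimal one.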

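The sub-Gaussian property is closely related to the tails of the distribution. Intuitively, being sub-Gaussian amounts to having tails no heavier than those of a Gaussian. This is actually a characterization of the property. Let $Z\sim\mathcal{N}(0,\sigma^2)$. Then:

$$X \text{ is sub-Gaussian} \iff \exists\, c>0 \text{ and } \sigma^2>0 \text{ such that } \mathbb{P}(|X-\mu|\geq x)\leq c\,\mathbb{P}(|Z|\geq x)\ \text{ for all } x\geq0.$$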

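That equivalence clearly implies exponential upper bounds for the tails of the distribution since a Gaussian $Z\sim\mathcal{N}(0,\sigma^2)$ satisfies

$$\mathbb{P}(Z\geq x)\leq\exp\left(-\frac{x^2}{2\sigma^2}\right)\quad\text{for all } x\geq0.$$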
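As a quick sanity check of the Gaussian tail bound $\mathbb{P}(Z\geq x)\leq\exp(-x^2/(2\sigma^2))$ for $Z\sim\mathcal{N}(0,\sigma^2)$ and $x\geq0$, the sketch below compares it with the exact tail, computed from the complementary error function. The function names and the grid of test values are illustrative choices only.

```python
import math


def gaussian_upper_tail(x, sigma):
    """Exact P(Z >= x) for Z ~ N(0, sigma^2), via the complementary error function."""
    return 0.5 * math.erfc(x / (sigma * math.sqrt(2.0)))


def exponential_bound(x, sigma):
    """The sub-Gaussian tail bound exp(-x^2 / (2 sigma^2))."""
    return math.exp(-x * x / (2.0 * sigma * sigma))


sigma = 1.5  # arbitrary standard deviation, chosen only for illustration
for x in [0.0, 0.5, 1.0, 2.0, 4.0]:
    exact = gaussian_upper_tail(x, sigma)
    bound = exponential_bound(x, sigma)
    print(f"x={x:3.1f}  exact tail={exact:.6f}  bound={bound:.6f}")
```

On each line the exact tail should be smaller than the bound.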

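That can also be seen directly: for a $\sigma^2$-sub-Gaussian variable $X$ with mean $\mu$ and any $x\geq0$, Markov's inequality gives

$$\forall\,\lambda>0:\quad \mathbb{P}(X-\mu\geq x)=\mathbb{P}\left(e^{\lambda(X-\mu)}\geq e^{\lambda x}\right)\leq\frac{\mathbb{E}\left[e^{\lambda(X-\mu)}\right]}{e^{\lambda x}}\leq\exp\left(\frac{\sigma^2\lambda^2}{2}-\lambda x\right).$$

The polynomial function $\lambda\mapsto\frac{\sigma^2\lambda^2}{2}-\lambda x$ is minimized on $\mathbb{R}_+$ at $\lambda=\frac{x}{\sigma^2}$, for which we obtain

$$\mathbb{P}(X-\mu\geq x)\leq\exp\left(-\frac{x^2}{2\sigma^2}\right).$$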

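In that sense, the sub-Gaussian property of any compactly supported random variable comes for free since in that case the tails are obviously lighter than those of a Gaussian. A simple general proxy variance is given by Hoeffding’s lemma. Let $X$ be supported on $[a,b]$ with $\mathbb{E}[X]=0$. Then for any $\lambda\in\mathbb{R}$,

$$\mathbb{E}[\exp(\lambda X)]\leq\exp\left(\frac{(b-a)^2}{8}\lambda^2\right)$$

so $X$ is $\frac{(b-a)^2}{4}$-sub-Gaussian.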
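Hoeffding’s lemma, $\mathbb{E}[e^{\lambda X}]\leq\exp\left((b-a)^2\lambda^2/8\right)$ for a centered variable supported on an interval of length $b-a$, can be checked numerically on a simple case. The sketch below uses a centered Bernoulli variable (support of length 1), an example chosen here for illustration, and scans a grid of $\lambda$ values. The function names, the grid and the rounding tolerance are assumptions.

```python
import math


def centered_bernoulli_mgf(p, lam):
    """E[exp(lam * (X - p))] for X ~ Bernoulli(p)."""
    return (1.0 - p) * math.exp(-lam * p) + p * math.exp(lam * (1.0 - p))


def hoeffding_bound(lam, a=0.0, b=1.0):
    """Right-hand side of Hoeffding's lemma for a support of length b - a."""
    return math.exp((b - a) ** 2 * lam**2 / 8.0)


def hoeffding_holds(p, lam_max=20.0, steps=2000):
    """Check the lemma on a grid of lam in [-lam_max, lam_max], up to rounding error."""
    for i in range(-steps, steps + 1):
        lam = lam_max * i / steps
        if centered_bernoulli_mgf(p, lam) > hoeffding_bound(lam) * (1.0 + 1e-12):
            return False
    return True


for p in [0.1, 0.3, 0.5, 0.9]:
    print(p, hoeffding_holds(p))
```

For $p=1/2$ the moment generating function is $\cosh(\lambda/2)$, which agrees with $\exp(\lambda^2/8)$ up to second order at $\lambda=0$, so the constant $\frac{1}{4}$ cannot be improved in that case.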

Back to the $\mathrm{Beta}(\alpha,\beta)$ distribution, where $[a,b]=[0,1]$: applying the lemma to $X-\mu$, whose support also has length 1, shows that the Beta is $\frac{1}{4}$-sub-Gaussian. Finding the optimal proxy variance is more challenging. In addition to characterizing the optimal proxy variance of the Beta distribution in the note, we provide the simple upper bound $\frac{1}{4(\alpha+\beta+1)}$. It matches Hoeffding’s bound in the extremal case $\alpha\to0$, $\beta\to0$, where the Beta random variable concentrates on the two-point set $\{0,1\}$ (and where Hoeffding’s bound is tight).
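The three quantities in play for a $\mathrm{Beta}(\alpha,\beta)$ variable, namely the variance $\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$, the simple upper bound $\frac{1}{4(\alpha+\beta+1)}$ and Hoeffding’s value $\frac{1}{4}$, can be compared with a few lines of Python. The function names and the parameter values are hypothetical, picked only to illustrate the ordering.

```python
def beta_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) random variable."""
    s = alpha + beta
    return alpha * beta / (s * s * (s + 1.0))


def simple_upper_bound(alpha, beta):
    """Upper bound 1 / (4 (alpha + beta + 1)) on the optimal proxy variance."""
    return 1.0 / (4.0 * (alpha + beta + 1.0))


HOEFFDING_PROXY = 0.25  # Hoeffding's proxy variance for a support of length 1

for alpha, beta in [(2.0, 5.0), (0.5, 0.5), (0.01, 0.01)]:
    print(
        f"alpha={alpha}, beta={beta}: "
        f"variance={beta_variance(alpha, beta):.5f}, "
        f"upper bound={simple_upper_bound(alpha, beta):.5f}, "
        f"Hoeffding={HOEFFDING_PROXY}"
    )
```

The variance never exceeds the simple upper bound because $\alpha\beta/(\alpha+\beta)^2\leq\frac{1}{4}$, with equality exactly when $\alpha=\beta$. As $\alpha$ and $\beta$ both shrink towards 0, the upper bound approaches Hoeffding’s $\frac{1}{4}$.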

In getting the bound $\frac{1}{4(\alpha+\beta+1)}$, we prove a recent conjecture made by Sam Elder in the context of Bayesian adaptive data analysis. I’ll say more about getting the optimal proxy variance in an upcoming post.

Cheers!

Julyan
