A very gentle introduction to e-values

Preliminaries

This blog post is intended to provide some intuition behind e-values, a framework for statistical hypothesis testing that has gained traction in the statistics and ML communities over the last 5 years or so. The goal of this post is to briefly communicate what e-values are and how they’re connected to testing. I assume familiarity with statistics and the basic concepts of hypothesis testing, so be sure to brush up on those before reading. I will also use Markov’s inequality, which states that for any non-negative random variable $Z$ and any $z>0$, \(P(Z \geq z) \leq \frac{\mathbb E[Z]}{z}.\)
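
If you want to see the inequality in action, here is a tiny simulation sketch in Python (the exponential distribution, sample size, and thresholds are arbitrary choices for illustration, not part of the result):

```python
# Numerical sanity check of Markov's inequality: for non-negative Z and z > 0,
# P(Z >= z) <= E[Z] / z.
import numpy as np

rng = np.random.default_rng(0)
Z = rng.exponential(scale=2.0, size=100_000)  # non-negative, E[Z] = 2

for z in [1.0, 4.0, 10.0]:
    empirical = np.mean(Z >= z)   # empirical tail probability
    bound = Z.mean() / z          # Markov bound
    print(f"z={z:4.1f}  P(Z>=z) ~ {empirical:.4f}  Markov bound {bound:.4f}")
```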

Introduction

The null hypothesis is some state of nature that you are interested in testing. For example, in a t-test for a regression coefficient, the null hypothesis is $H_0 : \beta_j = 0$, meaning that the $j$th variable has no impact on the mean of the outcome. We then construct a test statistic, in this case $Z_j = \hat \beta_j/(\hat \sigma \sqrt{v_j})$, where $v_j$ is the $j$th entry along the diagonal of $(X^T X)^{-1}$. The idea behind this test statistic is that, if $H_0$ were true, its magnitude would tend to be small. More formally, under $H_0$, $Z_j \sim t_{n-p-1}$, or a t-distribution with $n-p-1$ degrees of freedom, which is a symmetric distribution about zero. If $T_j=|Z_j|$ is large, then this is evidence against the null, particularly for large $n$.

In practice, we reject the null when $T_j$ is greater than or equal to $k$, the $1-\alpha/2$ quantile of a $t_{n-p-1}$ distribution, where $\alpha \in (0,1)$. Why this quantile specifically? Because we know the distribution of $Z_j$ under the null exactly, and so the test $\phi = \textbf{1}(T_j \geq k)$ is a level-$\alpha$ test, i.e. $\mathbb E^0[\phi] = \alpha$, where that expectation is taken with respect to the distribution of the data when $\beta_j=0$. In general, a level-$\alpha$ test $\phi$ satisfies $\mathbb E^P[\phi] \leq \alpha$ for all distributions $P$ covered by the null hypothesis.
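
To make this concrete, here is a minimal sketch of the test on simulated data (the sample size, design, and true coefficients are my own illustrative choices):

```python
# Sketch of the t-test for a single regression coefficient on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # intercept + p covariates
beta = np.array([1.0, 0.5, 0.0, 0.0])                       # last two coefficients are truly zero
y = X @ beta + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
df = n - p - 1                                  # degrees of freedom
sigma_hat = np.sqrt(resid @ resid / df)

j = 2                                           # test H_0: beta_j = 0 (true value is 0 here)
v_j = XtX_inv[j, j]
Z_j = beta_hat[j] / (sigma_hat * np.sqrt(v_j))
k = stats.t.ppf(1 - 0.05 / 2, df)               # 1 - alpha/2 quantile, alpha = 0.05
print(f"T_j = {abs(Z_j):.3f}, cutoff k = {k:.3f}, reject: {abs(Z_j) >= k}")
```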

For many models, we may not know the distribution of the test statistic when $H_0$ is true. In that case, how can we design a level-$\alpha$ test (without relying on an asymptotic approximation)?

E-Variables

More generally, say that the data are $Y=(Y_1, \dots, Y_n)$, and I have access to some nonnegative test statistic $E=E(Y)$. The notation here reflects that $E(Y)$ is a function of $Y$, and all probability statements below are expressed in terms of the distribution of $Y$. Say that I have a simple null hypothesis $H_0$, which just means that a single distribution $P$ for $Y$ corresponds to $H_0$; let $p$ denote its pdf/pmf.

Now, unlike the regression context, say that I do not know the distribution of $E$ when $Y \sim P$. However, I do know the following: \(\mathbb E^P[E(Y)] = \int E(y) p(y) \textrm{d} y \leq 1.\) If the above holds, then we call $E(Y)$ an e-variable, and $E(y)$, the value that $E(Y)$ takes after observing the data, is an e-value. Now, consider the test $\phi = \textbf{1}(E(Y) \geq 1/\alpha)$. Like our earlier example, this test rejects $H_0$ when a test statistic, $E(Y)$, is large, but the cut-off is not a quantile of the null distribution; instead, it is a universal cut-off that does not depend on $P$.

But is $\phi$ a level-$\alpha$ test? The answer is yes, and this is because of Markov’s inequality. Since $\mathbb E^P[E(Y)] \leq 1$, \(\mathbb E^P[\phi] = P(E(Y) \geq 1/\alpha) \leq \frac{\mathbb E^P[E(Y)]}{1/\alpha} \leq \alpha \cdot 1 = \alpha.\) So, a level-$\alpha$ test can always be performed if we have access to an e-variable. The only knowledge we need of $E(Y)$ when $Y \sim P$ is that its expected value is no more than $1$. If $H_0$ were composite (i.e. multiple distributions for $Y$ would correspond to that state of nature), then we need $\mathbb E^P[E(Y)] \leq 1$ for all distributions $P$ in $H_0$.
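
To see this guarantee in action, here is a small simulation sketch. The particular e-variable, $E(Y) = \exp(\lambda Y - \lambda^2/2)$ with $Y \sim N(0,1)$ under the null, is purely an illustrative choice of mine; it has expectation exactly 1 under the null:

```python
# Check that the e-value test phi = 1(E >= 1/alpha) is level-alpha, using only
# the fact that E has expectation at most 1 under the null.
import numpy as np

rng = np.random.default_rng(2)
alpha, lam = 0.05, 1.0
n_sims = 200_000

Y = rng.normal(size=n_sims)              # data drawn under the null N(0, 1)
E = np.exp(lam * Y - lam**2 / 2)         # e-variable with mean 1 under the null
type_I_error = np.mean(E >= 1 / alpha)   # rejection rate of the e-value test

print(f"mean of E under the null: {E.mean():.3f} (close to 1)")
print(f"type I error: {type_I_error:.4f} (at most alpha = {alpha})")
```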

Likelihood Ratios

A key example of an e-variable is a likelihood ratio. For now, let’s assume that $Y \in \mathbb R^n$ and $p(y) >0$ for every $y \in \mathbb R^n$. Returning to the simple hypothesis example from above, let $Q$ be any distribution with density $q$ (which also has full support on $\mathbb R^n$). A likelihood ratio is simply \(E(Y) = \frac{q(Y)}{p(Y)}.\) Then $E(Y)$ is an e-variable, since \(\mathbb E^P[E(Y)] = \int \frac{q(y)}{p(y)} p(y) \textrm{d} y = \int q(y) \textrm{d} y = 1,\) and so $\phi = \textbf{1}(E(Y) \geq 1/\alpha)$ is a level-$\alpha$ test. The choice of $Q$ depends on the alternative hypothesis $H_1$ (and is related to the e-power of $E(Y)$), but note that the e-variable property holds for any $Q$ with full support.
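
Here is a short sketch computing a likelihood-ratio e-value; the Gaussian null and alternative (iid $N(0,1)$ versus iid $N(1,1)$) are illustrative choices of mine, not anything canonical:

```python
# Likelihood-ratio e-value for a simple null: E(y) = q(y) / p(y).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
alpha, n = 0.05, 30
y = rng.normal(loc=1.0, scale=1.0, size=n)   # data actually drawn from Q

log_q = stats.norm.logpdf(y, loc=1.0).sum()  # log q(y), alternative N(1, 1)
log_p = stats.norm.logpdf(y, loc=0.0).sum()  # log p(y), null N(0, 1)
e_value = np.exp(log_q - log_p)              # E(y) = q(y) / p(y)

print(f"e-value: {e_value:.2f}, reject H_0: {e_value >= 1 / alpha}")
```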

What do we do if $H_0$ includes more than one distribution? Suppose now that we model $Y$ with a parametric family, say $\{P_\theta: \theta \in \Theta_0\}$ for some parameter space $\Theta_0$. Let $\hat \theta_0$ be the maximum likelihood estimator for $\theta$ over $\Theta_0$; that $\hat \theta_0$ maximizes the likelihood over the null is paramount for validity. Next, let \(E(Y) = \frac{q(Y)}{p_{\hat \theta_0}(Y)}.\) The denominator is now the maximized likelihood over $\Theta_0$. It turns out that $E(Y)$ is also an e-variable. This is because, for any $\theta \in \Theta_0$, $p_\theta(y) \leq p_{\hat \theta_0}(y)$. In turn, we have that \(E(Y) = \frac{q(Y)}{p_{\hat \theta_0}(Y)} \leq \frac{q(Y)}{p_{\theta}(Y)},\) implying \(\mathbb E^{P_\theta}[E(Y)] \leq \int \frac{q(y)}{p_{\theta}(y)} p_\theta(y) \textrm{d}y = \int q(y) \textrm{d} y = 1.\) Hence, if we can maximize the likelihood over the null, then we can specify an e-variable via a likelihood ratio.
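
And a sketch of the composite-null version, where the denominator is the likelihood maximized over the null. The one-sided Gaussian family $\{N(\theta, 1): \theta \leq 0\}$ and the alternative $N(1,1)$ are again just illustrative choices, picked so the constrained MLE has a closed form:

```python
# Maximized-likelihood e-value for a composite null {N(theta, 1): theta <= 0}.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
alpha, n = 0.05, 30
y = rng.normal(loc=0.5, scale=1.0, size=n)            # data from outside the null

theta_hat_0 = min(y.mean(), 0.0)                      # MLE over the null Theta_0
log_q = stats.norm.logpdf(y, loc=1.0).sum()           # log q(y), alternative N(1, 1)
log_p0 = stats.norm.logpdf(y, loc=theta_hat_0).sum()  # maximized null log-likelihood
e_value = np.exp(log_q - log_p0)

print(f"e-value: {e_value:.2f}, reject H_0: {e_value >= 1 / alpha}")
```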

Further Reading

There is a wide (and rapidly expanding) literature on e-values, and this blog post only covers the basic properties and simple examples. I would recommend starting with Hypothesis Testing with E-Values and following the references therein, as well as Universal Inference.