# p-Value Calculation

In Null Hypothesis Significance Tests, the p-value is the probability of observing an effect at least as extreme as the measured metric delta, assuming the null hypothesis is true. In practice, a p-value below your pre-defined Type I error threshold ($\alpha$) is treated as evidence of a true effect.

The methodology used for p-value calculation depends on the number of degrees of freedom ($\nu$). A two-sample z-test is appropriate for most experiments. Welch's t-test is used for smaller experiments with $\nu < 100$. In both cases, the p-value depends on the metric [mean](/stats-engine/metric-deltas) and [variance](/stats-engine/variance) computed for the test and control groups.

## Two-Sample Tests

### Two-Sided z-Test

The z-statistic (a.k.a. z-score) of a two-sample z-test can be written in several equivalent forms:

$$

\begin{split}
Z &= \frac{\overline X_t - \overline X_c}{\sqrt{var(\overline X_t)+ var(\overline X_c)}} \\
&= \frac{\overline X_t - \overline X_c}{\sqrt{var(\Delta \overline{X})}} \\
&= \frac{\overline X_t - \overline X_c}{\sqrt{\sigma_{\overline{X}_t}^2 + \sigma_{\overline{X}_c}^2}}
\end{split}
$$

where:

* $Z$ is the observed z-statistic (not the z-critical value $Z_{\alpha/2}$)
* $var(\Delta \overline{X})$ is the variance of the absolute delta of means
* $var(\overline{X}_i)$ is the variance of the sample mean for either the control or treatment group (details [here](/stats-engine/variance))
* $\sigma_{\overline{X}_i}$ is the standard error of the mean for either the control or treatment group (these are the values shown in Pulse under the Statistics tab of a metric)

The two-sided p-value is obtained from the standard normal cumulative distribution function:

$$

\text{p-value} = 2 \cdot \frac{1}{\sqrt{2\pi}} \int \limits _{-\infty}^{-|Z|}{e^{-t^2/2}dt}
$$
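
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the Pulse implementation: the function and parameter names are hypothetical, and the inputs are assumed to be the per-group means and the variances of those means. It relies on the identity $2\,\Phi(-|Z|) = \operatorname{erfc}(|Z|/\sqrt{2})$ from the standard library.

```python
import math


def z_statistic(mean_t, mean_c, var_mean_t, var_mean_c):
    """Observed z-statistic: delta of means over the standard error of the delta.

    var_mean_t and var_mean_c are the variances of the sample means
    (squared standard errors), not the raw per-unit variances.
    """
    return (mean_t - mean_c) / math.sqrt(var_mean_t + var_mean_c)


def two_sided_p_value(z):
    """Two-sided p-value 2 * Phi(-|z|), written via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2))


# Illustrative inputs only (made-up numbers, not real experiment data).
z = z_statistic(mean_t=10.5, mean_c=10.0, var_mean_t=0.03, var_mean_c=0.02)
p = two_sided_p_value(z)
print(f"z = {z:.3f}, two-sided p-value = {p:.4f}")
```

Using `erfc` directly avoids the precision loss of computing `1 - cdf` when $|Z|$ is large and the p-value is very small.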

### Welch's t-Test

For smaller sample sizes, Welch's t-test is preferred because it yields lower false positive rates when the two groups have unequal sizes and variances. In Pulse, Welch's t-test is automatically applied when the degrees of freedom $\nu < 100$.

We compute the t-statistic (a.k.a. t-score) in the same way as the two-sample z-statistic above. Additionally, we compute the degrees of freedom $\nu$ using:

$$

\begin{split}
\nu &= \frac{\left(var(\overline X_t) + var(\overline X_c)\right)^2}{\frac{var(\overline X_t)^2}{N_t - 1}+\frac{var(\overline X_c)^2}{N_c - 1}} \\
&= \frac{var(\Delta\overline{X})^2}{\frac{var(\overline X_t)^2}{N_t - 1}+\frac{var(\overline X_c)^2}{N_c - 1}}
\end{split}
$$

Here $N_t$ and $N_c$ are the sample sizes of the test and control groups. The p-value is then obtained from the t-distribution with $\nu$ degrees of freedom.
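
As a rough sketch of this procedure, the Python below computes $\nu$ from the formula above and then a two-sided p-value from the t-distribution. The function names are hypothetical, and the tail probability is obtained by numerically integrating the Student's t density with Simpson's rule. That integration is an assumption made here to keep the example dependency-free; it is not necessarily how Pulse evaluates the distribution.

```python
import math


def welch_degrees_of_freedom(var_mean_t, var_mean_c, n_t, n_c):
    """Welch-Satterthwaite degrees of freedom from the variances of the sample means."""
    numerator = (var_mean_t + var_mean_c) ** 2
    denominator = var_mean_t ** 2 / (n_t - 1) + var_mean_c ** 2 / (n_c - 1)
    return numerator / denominator


def t_pdf(x, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    log_norm = (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
    )
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))


def two_sided_t_p_value(t, nu, steps=2000):
    """Two-sided p-value: 1 minus the probability mass in [-|t|, |t|].

    The mass is integrated with composite Simpson's rule (steps must be even).
    """
    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    central_mass = 2 * total * h / 3  # the density is symmetric around zero
    return max(0.0, 1.0 - central_mass)


# Illustrative inputs only (made-up numbers, not real experiment data).
nu = welch_degrees_of_freedom(var_mean_t=0.5, var_mean_c=0.3, n_t=12, n_c=15)
p = two_sided_t_p_value(t=2.1, nu=nu)
print(f"nu = {nu:.2f}, two-sided p-value = {p:.4f}")
```

Because the result is computed as one minus an integral, it loses relative precision for very small p-values. If SciPy is available, `2 * scipy.stats.t.sf(abs(t), nu)` is a more robust way to get the same quantity.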

### One-Sided z-Test

The one-sided z-test computes the z-statistic $Z$ in the same way as the two-sided test above.

The one-sided p-value is also obtained from the standard normal cumulative distribution function, but the tail depends on the direction of the test:

$$

\text{p-value} =
\begin{cases} 1 - \frac{1}{\sqrt{2\pi}} \int \limits _{-\infty}^{Z}{e^{-t^2/2}dt} &\text{if right-hand test}\\
\frac{1}{\sqrt{2\pi}} \int \limits _{-\infty}^{Z}{e^{-t^2/2}dt} &\text{if left-hand test}
\end{cases}
$$

where:

* $Z$ is the z-statistic computed as in the two-sided test. Note that this uses the signed z-statistic, not its absolute value as in the two-sided p-value.
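
The two cases above can be sketched as follows. The function name and the `tail` argument are hypothetical and chosen only for illustration. The right-hand case uses $1 - \Phi(Z) = \Phi(-Z)$ so that small p-values are computed without subtracting from 1.

```python
import math


def normal_cdf(z):
    """Standard normal CDF Phi(z), written via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def one_sided_p_value(z, tail="right"):
    """One-sided p-value from the signed z-statistic.

    tail="right": 1 - Phi(z), computed as Phi(-z)
    tail="left":  Phi(z)
    """
    if tail == "right":
        return normal_cdf(-z)
    if tail == "left":
        return normal_cdf(z)
    raise ValueError("tail must be 'right' or 'left'")


# Illustrative z-statistic only (made-up number).
z = 1.2
print(f"right-hand p-value = {one_sided_p_value(z, 'right'):.4f}")
print(f"left-hand p-value  = {one_sided_p_value(z, 'left'):.4f}")
```

For any given $Z$ the two one-sided p-values sum to 1, which is a quick sanity check that the sign of the z-statistic is being handled correctly.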
