# P-value

## Overview

In statistical hypothesis testing, the **p-value** is the probability of obtaining a result at least as extreme as the one observed, *assuming* that the result is due to chance alone. The fact that p-values are conditioned on this assumption is crucial to their correct interpretation. The p-value may be noted as a decimal: p-value < 0.05 means that, if chance alone were at work, a result this extreme would arise less than 5% of the time. The lower the p-value, the less compatible the observed result is with chance alone.^{[1]}

## Coin flipping example

For example, say an experiment is performed to determine if a coin flip is fair (50% chance of landing heads or tails), or unfairly biased, either toward heads (> 50% chance of landing heads) or toward tails (< 50% chance of landing heads). Since we consider both biased alternatives, a two-tailed test is performed. The null hypothesis is that the coin is fair, and that any deviations from the 50% rate can be ascribed to chance alone. Suppose that the experimental results show the coin turning up heads 14 times out of 20 total flips. The p-value of this result would be the chance of a fair coin landing on heads *at least* 14 times out of 20 flips (as larger values in this case are also less favorable to the null hypothesis of a fair coin) or landing on tails *at most* 6 times out of 20 flips. In this case the random variable *T* has a binomial distribution. The probability that 20 flips of a fair coin would result in 14 or more heads is 0.0577. Since this is a two-tailed test, the probability that 20 flips of the coin would result in 14 or more heads or 6 or less heads is 0.0577 x 2 = 0.115.
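The tail probabilities above can be checked with a short sketch using only the binomial formula; the function name `binom_tail` is chosen here for illustration:

```python
from math import comb

def binom_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): sum the upper-tail probabilities."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# One-tailed: probability of 14 or more heads in 20 flips of a fair coin
one_tailed = binom_tail(20, 14)

# Two-tailed: by symmetry of the fair coin, also count 6 or fewer heads
two_tailed = 2 * one_tailed

print(f"one-tailed: {one_tailed:.4f}")  # ≈ 0.0577
print(f"two-tailed: {two_tailed:.3f}")  # ≈ 0.115
```

The doubling step relies on the symmetry of the Binomial(20, 0.5) distribution: "6 or fewer heads" has exactly the same probability as "14 or more heads".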

Generally, the smaller the p-value, the stronger the evidence that the results came from a biased coin rather than from chance alone.

## Interpretation

Generally, one rejects the null hypothesis if the p-value is smaller than or equal to the significance level, often represented by the Greek letter α (alpha). If the level is 0.05, then a result at least as extreme as the one observed would occur only 5% of the time, given that the null hypothesis is true.

In the above example, the calculated p-value exceeds 0.05, and thus the null hypothesis - that the observed result of 14 heads out of 20 flips can be ascribed to chance alone - is *not* rejected. Such a finding is often stated as being "not statistically significant at the 5% level".

However, had a single extra head been obtained, the resulting two-tailed p-value would be 0.0414 (2 x 0.0207). This time the null hypothesis - that the observed result of 15 heads out of 20 flips can be ascribed to chance alone - is rejected, since 0.0414 is below the 0.05 level. Such a finding would be described as being "statistically significant at the 5% level".
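The two decisions above can be reproduced with a minimal sketch, assuming the same binomial model and a significance level of α = 0.05 (the helper name `two_tailed_p` is illustrative):

```python
from math import comb

def two_tailed_p(n, heads, p=0.5):
    """Two-tailed binomial p-value: twice the upper-tail probability.

    Assumes heads >= n * p, as in both examples in the text.
    """
    upper = sum(comb(n, i) * p**i * (1 - p)**(n - i)
                for i in range(heads, n + 1))
    return min(1.0, 2 * upper)

alpha = 0.05  # chosen before looking at the data
for heads in (14, 15):
    p_val = two_tailed_p(20, heads)
    verdict = "rejected" if p_val <= alpha else "not rejected"
    print(f"{heads} heads out of 20: p = {p_val:.4f}, null {verdict}")
```

Running this prints p = 0.1153 (null not rejected) for 14 heads and p = 0.0414 (null rejected) for 15 heads, matching the worked example.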

Critics of p-values point out that the criterion used to decide "statistical significance" is based on the somewhat arbitrary choice of level (often set at 0.05). A proposed replacement for the p-value is p-rep.

## Frequent misunderstandings

There are several common misunderstandings about p-values.^{[2]}

- The p-value is **not** the probability that the null hypothesis is true (a claim sometimes used to justify the "rule" of considering p-values closer to 0 (zero) as significant).
  - In fact, frequentist statistics does not, and cannot, attach probabilities to hypotheses. Comparison of Bayesian and classical approaches shows that a p-value can be very close to zero while the posterior probability of the null is very close to unity. This is the **Jeffreys-Lindley paradox**.
- The p-value is **not** the probability that a finding is "merely a fluke" (again, a claim used to justify the "rule" of considering small p-values as "significant").
  - As the calculation of a p-value is based on the *assumption* that a finding is the product of chance alone, it patently cannot simultaneously be used to gauge the probability of that assumption being true.
- The p-value is **not** the probability of falsely rejecting the null hypothesis. This error is a version of the so-called prosecutor's fallacy.
- The p-value is **not** the probability that a replicating experiment would not yield the same conclusion.
- 1 − (p-value) is **not** the probability of the alternative hypothesis being true (see (1)).
- The significance level of the test is **not** determined by the p-value.
  - The significance level of a test is a value that should be decided upon by the agent interpreting the data before the data are viewed, and is compared against the p-value or any other statistic calculated after the test has been performed.

- The p-value does not indicate the size or importance of the observed effect (compare with effect size).

## External links

- Free p-Value Calculator for the Chi-Square test from Daniel Soper's *Free Statistics Calculators* website. Computes the one-tailed probability value of a chi-square test (i.e., the area under the chi-square distribution from the chi-square value to infinity), given the chi-square value and the degrees of freedom.
- Free p-Value Calculator for the Fisher F-test from Daniel Soper's *Free Statistics Calculators* website. Computes the probability value of an F-test, given the F-value, numerator degrees of freedom, and denominator degrees of freedom.
- Free p-Value Calculator for the Student t-test from Daniel Soper's *Free Statistics Calculators* website. Computes the one-tailed and two-tailed probability values of a t-test, given the t-value and the degrees of freedom.
- Understanding P-values, Jim Berger's page with links to various websites about p-values, and a Java applet that illustrates how the numerical values of p-values can give quite misleading impressions about the truth or falsity of the hypothesis under test.

## Additional reading

- Dallal GE (2007) Historical background to the origins of p-values and the choice of 0.05 as the cut-off for significance
- Hubbard R, Armstrong JS (2005) Historical background on the widespread confusion of the p-value (PDF)
- Fisher's method for combining independent tests of significance using their p-values

## References

- ↑ Duffy ME, Munroe BH, Jacobsen BS. *Sifting the evidence — what's wrong with significance tests?*
- ↑ Sterne JAC, Smith GD (2001). "Sifting the evidence — what's wrong with significance tests?". *BMJ*. **322** (7280): 226–231.
