Yule-Simon distribution

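Yule-Simon
Figure: Yule-Simon PMF on a log-log scale. (The function is defined only at integer values of $k$; the connecting lines do not indicate continuity.)
Figure: Yule-Simon CDF. (The function is defined only at integer values of $k$; the connecting lines do not indicate continuity.)
Parameters: $\rho > 0$, shape (real)
Support: $k \in \{1, 2, \dots\}$
Probability mass function (pmf): $\rho\,\mathrm{B}(k, \rho+1)$
Cumulative distribution function (cdf): $1 - k\,\mathrm{B}(k, \rho+1)$
Mean: $\frac{\rho}{\rho-1}$ for $\rho > 1$
Median
Mode: $1$
Variance: $\frac{\rho^2}{(\rho-1)^2(\rho-2)}$ for $\rho > 2$
Skewness: $\frac{(\rho+1)^2\sqrt{\rho-2}}{(\rho-3)\,\rho}$ for $\rho > 3$
Excess kurtosis: $\rho + 3 + \frac{11\rho^3 - 49\rho - 22}{(\rho-4)(\rho-3)\,\rho}$ for $\rho > 4$
Entropy
Moment-generating function (mgf): $\frac{\rho}{\rho+1}\,{}_2F_1(1, 1; \rho+2; e^{t})\,e^{t}$
Characteristic function: $\frac{\rho}{\rho+1}\,{}_2F_1(1, 1; \rho+2; e^{it})\,e^{it}$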

In probability and statistics, the Yule-Simon distribution is a discrete probability distribution named after Udny Yule and Herbert Simon. Simon originally called it the Yule distribution.

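The probability mass function (pmf) of the Yule-Simon(ρ) distribution is

$$f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1),$$

for integer $k \geq 1$ and real $\rho > 0$, where $\mathrm{B}$ is the beta function. Equivalently, the pmf can be written in terms of the falling factorial as

$$f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{(k+\rho)^{\underline{\rho+1}}},$$

where $\Gamma$ is the gamma function and $(k+\rho)^{\underline{\rho+1}} = \Gamma(k+\rho+1)/\Gamma(k)$. Thus, if $\rho$ is an integer,

$$f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}.$$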
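As a rough illustration of the pmf $f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1)$, the sketch below evaluates it with the standard-library log-gamma function and compares it with the factorial form that holds for integer $\rho$. The function names are hypothetical and chosen only for this example.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """pmf f(k; rho) = rho * B(k, rho + 1), computed through log-gamma.

    Working in log space avoids overflow of Gamma(k + rho + 1) for large k.
    """
    if k < 1 or rho <= 0:
        raise ValueError("require integer k >= 1 and rho > 0")
    log_beta = math.lgamma(k) + math.lgamma(rho + 1.0) - math.lgamma(k + rho + 1.0)
    return rho * math.exp(log_beta)


def yule_simon_pmf_integer_rho(k: int, rho: int) -> float:
    """Integer-rho form: rho * rho! * (k - 1)! / (k + rho)!."""
    if k < 1 or rho < 1:
        raise ValueError("require integers k >= 1 and rho >= 1")
    return rho * math.factorial(rho) * math.factorial(k - 1) / math.factorial(k + rho)


if __name__ == "__main__":
    # The two forms should agree for integer rho (up to rounding).
    for k in (1, 2, 5, 10):
        print(k, yule_simon_pmf(k, 3.0), yule_simon_pmf_integer_rho(k, 3))
```

For $k = 1$ the pmf reduces to $\rho/(\rho+1)$, which gives a quick sanity check on any implementation.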

The probability mass function $f$ has the property that for sufficiently large $k$

$$f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.$$

This means that the tail of the Yule-Simon distribution is a realization of Zipf's law: $f(k;\rho)$ can be used to model, for example, the relative frequency of the $k$th most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of $k$.
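The power-law tail can be checked numerically. The following minimal sketch (function names are hypothetical) computes the ratio of the exact pmf $\rho\,\mathrm{B}(k, \rho+1)$ to the approximation $\rho\,\Gamma(\rho+1)/k^{\rho+1}$; the ratio should approach 1 as $k$ grows.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """Exact pmf rho * B(k, rho + 1), computed through log-gamma."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1.0) - math.lgamma(k + rho + 1.0)
    return rho * math.exp(log_beta)


def zipf_tail(k: int, rho: float) -> float:
    """Large-k approximation rho * Gamma(rho + 1) / k**(rho + 1)."""
    return rho * math.gamma(rho + 1.0) / k ** (rho + 1.0)


def tail_ratio(k: int, rho: float) -> float:
    """Exact pmf divided by its power-law approximation."""
    return yule_simon_pmf(k, rho) / zipf_tail(k, rho)


if __name__ == "__main__":
    for k in (10, 100, 1000, 10000):
        print(k, tail_ratio(k, 1.0))
```

For $\rho = 1$ the ratio is exactly $k/(k+1)$, so the approximation is already within 1% at $k = 100$.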

Occurrence

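The Yule-Simon distribution arises as a continuous mixture of geometric distributions. Specifically, assume that $W$ follows an exponential distribution with scale $1/\rho$ or rate $\rho$:

$$W \sim \mathrm{Exponential}(\rho), \qquad h(w;\rho) = \rho\,e^{-\rho w}, \quad w \geq 0.$$

Then a Yule-Simon distributed variable $K$ has the following geometric distribution conditional on $W$:

$$K \mid W \sim \mathrm{Geometric}\left(e^{-W}\right).$$

The pmf of a geometric distribution is

$$g(k;p) = p\,(1-p)^{k-1}$$

for $k \in \{1, 2, \dots\}$. The Yule-Simon pmf is then the following exponential-geometric mixture distribution:

$$f(k;\rho) = \int_0^{\infty} g\left(k; e^{-w}\right) h(w;\rho)\,dw.$$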
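The mixture representation suggests a simple two-stage sampler: draw $W$ from an exponential distribution with rate $\rho$, then draw $K$ from a geometric distribution on $\{1, 2, \dots\}$ with success probability $e^{-W}$. The sketch below is one way to do this with the Python standard library; the function name is hypothetical, the geometric draw uses inverse-transform sampling as an implementation choice, and no attempt is made to handle extremely small $\rho$, where $K$ can exceed floating-point range.

```python
import math
import random


def sample_yule_simon(rho: float, rng: random.Random) -> int:
    """Draw one Yule-Simon(rho) variate via the exponential-geometric mixture."""
    if rho <= 0:
        raise ValueError("rho must be positive")
    w = rng.expovariate(rho)      # W ~ Exponential(rate = rho)
    p = math.exp(-w)              # success probability of the geometric stage
    if p >= 1.0:                  # W == 0: the first trial always succeeds
        return 1
    u = 1.0 - rng.random()        # uniform on (0, 1], keeps log(u) finite
    # Inverse transform for a geometric on {1, 2, ...}:
    # P(K > k) = (1 - p)**k, so K = floor(ln(u) / ln(1 - p)) + 1.
    return int(math.log(u) / math.log1p(-p)) + 1


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_yule_simon(4.0, rng) for _ in range(100_000)]
    print("sample mean:", sum(draws) / len(draws))
    print("fraction equal to 1:", draws.count(1) / len(draws))
```

With $\rho = 4$, the fraction of draws equal to 1 should be close to $f(1;\rho) = \rho/(\rho+1) = 0.8$.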

Generalizations

The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function. The probability mass function of the generalized Yule-Simon(ρ, α) distribution is defined as

$$f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}}\,\mathrm{B}_{1-\alpha}(k, \rho+1),$$

with $0 \leq \alpha < 1$. For $\alpha = 0$ the ordinary Yule-Simon(ρ) distribution is obtained as a special case. The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.
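A minimal numerical sketch of the generalized pmf follows, assuming the form $\frac{\rho}{1-\alpha^{\rho}}\,\mathrm{B}_{1-\alpha}(k, \rho+1)$ with the incomplete beta function $\mathrm{B}_x(a, b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt$. The standard library has no incomplete beta function, so the integral is approximated here with composite Simpson quadrature, which is an illustrative choice and only suitable for $a \geq 1$ and $b \geq 1$ (no endpoint singularities). A library routine such as SciPy's regularized `betainc` could be substituted. All function names are hypothetical.

```python
import math


def incomplete_beta(x: float, a: float, b: float, n: int = 2000) -> float:
    """Approximate B_x(a, b) = int_0^x t**(a-1) * (1-t)**(b-1) dt.

    Composite Simpson rule with n (even) subintervals; assumes a >= 1, b >= 1.
    """
    if x <= 0.0:
        return 0.0

    def integrand(t: float) -> float:
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    h = x / n
    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def generalized_yule_simon_pmf(k: int, rho: float, alpha: float) -> float:
    """Generalized pmf rho / (1 - alpha**rho) * B_{1-alpha}(k, rho + 1)."""
    if k < 1 or rho <= 0 or not (0.0 <= alpha < 1.0):
        raise ValueError("require k >= 1, rho > 0 and 0 <= alpha < 1")
    return rho / (1.0 - alpha ** rho) * incomplete_beta(1.0 - alpha, k, rho + 1.0)


if __name__ == "__main__":
    # alpha = 0 recovers the ordinary pmf; alpha > 0 thins the upper tail.
    for k in (1, 5, 20, 50):
        print(k, generalized_yule_simon_pmf(k, 2.0, 0.0),
              generalized_yule_simon_pmf(k, 2.0, 0.2))
```

Comparing the two printed columns shows the cutoff: for $\alpha > 0$ the probabilities at large $k$ fall off much faster than in the $\alpha = 0$ case.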

Figure: Plot of the Yule-Simon(1) distribution (red) and its asymptotic Zipf law (blue).

References

  • Herbert A. Simon, On a Class of Skew Distribution Functions, Biometrika 42(3/4): 425–440, December 1955.
  • Colin Rose and Murray D. Smith, Mathematical Statistics with Mathematica. New York: Springer, 2002, ISBN 0-387-95234-9. (See page 107, where it is called the "Yule distribution".)


