# Wishart distribution

- **Parameters:** ${\displaystyle n>0\!}$ degrees of freedom (real); ${\displaystyle \mathbf {V} >0\,}$ scale matrix (positive definite)
- **Support:** ${\displaystyle \mathbf {W} \!}$ is positive definite
- **Probability density function:** ${\displaystyle {\frac {\left|\mathbf {W} \right|^{\frac {n-p-1}{2}}}{2^{\frac {np}{2}}\left|{\mathbf {V} }\right|^{\frac {n}{2}}\Gamma _{p}({\frac {n}{2}})}}\exp \left(-{\frac {1}{2}}{\rm {Tr}}({\mathbf {V} }^{-1}\mathbf {W} )\right)}$
- **Mean:** ${\displaystyle n\mathbf {V} }$
- **Mode:** ${\displaystyle (n-p-1)\mathbf {V} {\text{ for }}n\geq p+1}$
- **Characteristic function:** ${\displaystyle \Theta \mapsto \left|{\mathbf {I} }-2i\,{\mathbf {\Theta } }{\mathbf {V} }\right|^{-n/2}}$

In statistics, the Wishart distribution, named in honor of John Wishart, is any of a family of probability distributions for nonnegative-definite matrix-valued random variables ("random matrices"). These distributions are of great importance in the estimation of covariance matrices in multivariate statistics.

## Definition

Suppose X is an n × p matrix, each row of which is independently drawn from a p-variate normal distribution with zero mean:

${\displaystyle X_{(i)}{=}(x_{i}^{1},\dots ,x_{i}^{p})^{T}\sim N_{p}(0,V),}$

Then the Wishart distribution is the probability distribution of the p×p random matrix

${\displaystyle S={X}^{T}{X},\,\!}$

where S is known as the scatter matrix. One indicates that S has that probability distribution by writing

${\displaystyle S\sim W_{p}(V,n).}$

The positive integer n is the number of degrees of freedom. Sometimes this is written W(V, p, n).
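The construction above can be sketched numerically: draw S = XᵀX from rows X₍ᵢ₎ ~ Nₚ(0, V) and check that its empirical mean is nV, the mean of Wₚ(V, n). The particular V and n below are illustrative choices, not taken from the article.

```python
import numpy as np

# Illustrative parameters: a 2x2 positive definite scale matrix V and n = 6
# degrees of freedom.
rng = np.random.default_rng(0)
p, n = 2, 6
V = np.array([[2.0, 0.5],
              [0.5, 1.0]])

def wishart_sample(V, n, rng):
    """Draw one W_p(V, n) sample as S = X^T X, with each row of X ~ N_p(0, V)."""
    X = rng.multivariate_normal(np.zeros(len(V)), V, size=n)
    return X.T @ X

# The mean of W_p(V, n) is n*V; estimate it from many draws.
S_mean = np.mean([wishart_sample(V, n, rng) for _ in range(20000)], axis=0)
```

Each draw is symmetric and nonnegative definite by construction, since S = XᵀX.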

If p = 1 and V = 1 then this distribution is a chi-square distribution with n degrees of freedom.
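This special case can be checked with SciPy (a sketch; the value n = 5 and the evaluation points are arbitrary choices): for p = 1 and V = 1, the Wishart density should coincide with the chi-square density with n degrees of freedom.

```python
import numpy as np
from scipy.stats import wishart, chi2

# For p = 1 and V = 1, W_1(1, n) is chi-square with n degrees of freedom:
# the two densities should agree pointwise.
n = 5
x = np.linspace(0.5, 10.0, 20)
w_pdf = np.array([wishart.pdf(xi, df=n, scale=1.0) for xi in x])
c_pdf = chi2.pdf(x, df=n)
```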

## Occurrence

The Wishart distribution arises frequently in likelihood-ratio tests in multivariate statistical analysis. It also arises in the spectral theory of random matrices.

## Probability density function

The Wishart distribution can be characterized by its probability density function, as follows.

Let W be a p × p symmetric matrix of random variables that is positive definite. Let V be a (fixed) positive definite matrix of size p × p.

Then, for n ≥ p, W has a Wishart distribution with n degrees of freedom if it has a probability density function fW given by

${\displaystyle f_{\mathbf {W} }(w)={\frac {\left|w\right|^{(n-p-1)/2}\exp \left[-{\rm {trace}}({\mathbf {V} }^{-1}w/2)\right]}{2^{np/2}\left|{\mathbf {V} }\right|^{n/2}\Gamma _{p}(n/2)}}}$

where Γp(·) is the multivariate gamma function defined as

${\displaystyle \Gamma _{p}(n/2)=\pi ^{p(p-1)/4}\prod _{j=1}^{p}\Gamma \left[(n+1-j)/2\right].}$

In fact the above definition can be extended to any real n > p − 1.
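The density and the multivariate gamma function above can be evaluated directly and checked against SciPy's implementation. This is a sketch on the log scale (for numerical stability); the matrices V and w and the value n = 5 are illustrative choices.

```python
import numpy as np
from math import lgamma, log, pi
from scipy.stats import wishart
from scipy.special import multigammaln

def log_multigamma(a, p):
    """log Gamma_p(a) via the product formula above."""
    return p * (p - 1) / 4 * log(pi) + sum(lgamma(a + (1 - j) / 2)
                                           for j in range(1, p + 1))

def wishart_logpdf(w, V, n):
    """Log of the Wishart density f_W(w) given above."""
    p = V.shape[0]
    _, logdet_w = np.linalg.slogdet(w)
    _, logdet_V = np.linalg.slogdet(V)
    return ((n - p - 1) / 2) * logdet_w \
        - 0.5 * np.trace(np.linalg.solve(V, w)) \
        - (n * p / 2) * log(2) - (n / 2) * logdet_V - log_multigamma(n / 2, p)

V = np.array([[2.0, 0.3],
              [0.3, 1.0]])
w = np.array([[3.0, 0.5],
              [0.5, 2.0]])
n = 5
manual = wishart_logpdf(w, V, n)
reference = wishart.logpdf(w, df=n, scale=V)
```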

## Characteristic function

The characteristic function of the Wishart distribution is

${\displaystyle \Theta \mapsto \left|{\mathbf {I} }-2i\,{\mathbf {\Theta } }{\mathbf {V} }\right|^{-n/2}.}$

In other words,

${\displaystyle \Theta \mapsto {\mathcal {E}}\left\{\mathrm {exp} \left[i\cdot \mathrm {trace} ({\mathbf {W} }{\mathbf {\Theta } })\right]\right\}=\left|{\mathbf {I} }-2i{\mathbf {\Theta } }{\mathbf {V} }\right|^{-n/2}}$

where ${\displaystyle {\mathcal {E}}(\cdot )}$ denotes expectation.
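For p = 1 the characteristic function reduces to that of a scaled chi-square, since W = V·χ²ₙ, and the formula can be checked by Monte Carlo. This is a sketch; n, V, and θ below are arbitrary choices.

```python
import numpy as np

# For p = 1, W = V * chi2_n, so E[exp(i*theta*W)] should equal
# (1 - 2i*theta*V)^(-n/2), the 1x1 case of |I - 2i*Theta*V|^(-n/2).
rng = np.random.default_rng(1)
n, V, theta = 4, 1.5, 0.2
samples = V * rng.chisquare(n, size=400_000)
empirical = np.mean(np.exp(1j * theta * samples))
exact = (1 - 2j * theta * V) ** (-n / 2)
```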

## Theorem

If ${\displaystyle \scriptstyle {\mathbf {W} }}$ has a Wishart distribution with m degrees of freedom and variance matrix ${\displaystyle \scriptstyle {\mathbf {V} }}$—write ${\displaystyle \scriptstyle {\mathbf {W} }\sim {\mathbf {W} }_{p}({\mathbf {V} },m)}$—and ${\displaystyle \scriptstyle {\mathbf {C} }}$ is a q × p matrix of rank q, then

${\displaystyle {\mathbf {C} }{\mathbf {W} }{\mathbf {C} '}\sim {\mathbf {W} }_{q}\left({\mathbf {C} }{\mathbf {V} }{\mathbf {C} '},m\right).}$

### Corollary 1

If ${\displaystyle {\mathbf {z} }}$ is a nonzero ${\displaystyle p\times 1}$ constant vector, then ${\displaystyle {\mathbf {z} '}{\mathbf {W} }{\mathbf {z} }\sim \sigma _{z}^{2}\chi _{m}^{2}}$.

In this case, ${\displaystyle \chi _{m}^{2}}$ is the chi-square distribution and ${\displaystyle \sigma _{z}^{2}={\mathbf {z} '}{\mathbf {V} }{\mathbf {z} }}$ (note that ${\displaystyle \sigma _{z}^{2}}$ is a constant; it is positive because ${\displaystyle {\mathbf {V} }}$ is positive definite).
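Corollary 1 can be checked by simulation: z′Wz should behave like σ²_z·χ²ₘ, whose mean is m·σ²_z. The scale matrix V, the degrees of freedom m, and the vector z below are illustrative choices.

```python
import numpy as np

# Monte Carlo check of Corollary 1: the quadratic form z'Wz has mean
# m * sigma_z^2 where sigma_z^2 = z'Vz.
rng = np.random.default_rng(2)
p, m = 3, 7
V = np.array([[1.0, 0.2, 0.0],
              [0.2, 2.0, 0.3],
              [0.0, 0.3, 1.5]])
z = np.array([1.0, -1.0, 2.0])
sigma2 = z @ V @ z          # z'Vz, positive since V is positive definite

def quad_form_sample():
    X = rng.multivariate_normal(np.zeros(p), V, size=m)
    W = X.T @ X             # W ~ W_p(V, m)
    return z @ W @ z

vals = np.array([quad_form_sample() for _ in range(50000)])
```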

### Corollary 2

Consider the case where ${\displaystyle {\mathbf {z} '}=(0,\ldots ,0,1,0,\ldots ,0)}$ (that is, the j-th element is one and all others zero). Then corollary 1 above shows that

${\displaystyle w_{jj}\sim \sigma _{jj}\chi _{m}^{2}}$

gives the marginal distribution of each of the elements on the matrix's diagonal.
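The diagonal marginal can likewise be checked by simulation: w_jj/σ_jj should follow χ²ₘ, which has mean m and variance 2m. The scale matrix V and m = 8 are illustrative choices.

```python
import numpy as np

# Monte Carlo check of Corollary 2: each diagonal element w_jj, divided by
# sigma_jj = v_jj, should follow chi2_m (mean m, variance 2m).
rng = np.random.default_rng(3)
m = 8
V = np.array([[2.0, 0.4],
              [0.4, 1.0]])

def diag_sample(j):
    X = rng.multivariate_normal(np.zeros(2), V, size=m)
    W = X.T @ X             # W ~ W_2(V, m)
    return W[j, j] / V[j, j]

ratios = np.array([diag_sample(1) for _ in range(40000)])
```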

Noted statistician George Seber points out that the Wishart distribution is not called the "multivariate chi-square distribution" because the marginal distribution of the off-diagonal elements is not chi-square. Seber prefers to reserve the term multivariate for the case when all univariate marginals belong to the same family.

## Estimator of the multivariate normal distribution

The Wishart distribution is the probability distribution of the maximum-likelihood estimator (MLE) of the covariance matrix of a multivariate normal distribution. The derivation of the MLE is perhaps surprisingly subtle and elegant. It involves the spectral theorem and the reason why it can be better to view a scalar as the trace of a 1×1 matrix than as a mere scalar. See estimation of covariance matrices.
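The connection can be sketched numerically: with a known zero mean, the MLE of the covariance Σ from the n rows of X is XᵀX/n, i.e. S/n with S ~ Wₚ(Σ, n), so the estimate should concentrate around Σ for large n. The matrix Σ below is an illustrative choice.

```python
import numpy as np

# With known zero mean, the covariance MLE is Sigma_hat = X^T X / n,
# so n * Sigma_hat is Wishart-distributed W_p(Sigma, n).
rng = np.random.default_rng(4)
Sigma = np.array([[1.0, 0.6],
                  [0.6, 2.0]])
n = 50000
X = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
Sigma_hat = X.T @ X / n
```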

## Drawing values from the distribution

The following procedure is due to Smith & Hocking [1]. One can sample random p × p matrices from a p-variate Wishart distribution with scale matrix ${\displaystyle {\textbf {V}}}$ and n degrees of freedom (for ${\displaystyle n\geq p}$) as follows:

1. Generate a random p × p lower triangular matrix ${\displaystyle {\textbf {A}}}$ such that:
• ${\displaystyle a_{ii}=(\chi _{n-i+1}^{2})^{1/2}}$, i.e. ${\displaystyle a_{ii}}$ is the square root of a sample taken from a chi-square distribution ${\displaystyle \chi _{n-i+1}^{2}}$
• ${\displaystyle a_{ij}}$, for ${\displaystyle j<i}$, is sampled from a standard normal distribution ${\displaystyle N_{1}(0,1)}$
2. Compute the Cholesky decomposition of ${\displaystyle {\textbf {V}}={\textbf {L}}{\textbf {L}}^{T}}$.
3. Compute the matrix ${\displaystyle {\textbf {X}}={\textbf {L}}{\textbf {A}}{\textbf {A}}^{T}{\textbf {L}}^{T}}$. At this point, ${\displaystyle {\textbf {X}}}$ is a sample from the Wishart distribution ${\displaystyle W_{p}({\textbf {V}},n)}$.

Note that if ${\displaystyle {\textbf {V}}={\textbf {I}}}$, the identity matrix, then the sample can be obtained directly as ${\displaystyle {\textbf {X}}={\textbf {A}}{\textbf {A}}^{T}}$, since the Cholesky factor of the identity matrix is ${\displaystyle {\textbf {L}}={\textbf {I}}}$.
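The three steps above (this construction is also known as the Bartlett decomposition) can be sketched in NumPy; the sanity check compares the empirical mean against nV, the mean of the Wishart distribution. The particular V and n are illustrative choices.

```python
import numpy as np

def wishart_rvs(V, n, rng):
    """Sample W_p(V, n) via the procedure above: X = L A A^T L^T."""
    p = V.shape[0]
    A = np.zeros((p, p))
    for i in range(p):                            # i is 0-based here, so the
        A[i, i] = np.sqrt(rng.chisquare(n - i))   # diagonal df are n-i+1 in
        A[i, :i] = rng.standard_normal(i)         # 1-based terms; a_ij ~ N(0,1)
    L = np.linalg.cholesky(V)                     # V = L L^T
    return L @ A @ A.T @ L.T

rng = np.random.default_rng(5)
V = np.array([[2.0, 0.5],
              [0.5, 1.0]])
n = 6
mean_est = np.mean([wishart_rvs(V, n, rng) for _ in range(20000)], axis=0)
```

Unlike forming S = XᵀX from n normal rows, this sampler's cost does not grow with n, which matters when n is large.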