# Wold's theorem

In statistics, Wold's theorem or the Wold representation theorem, named after Herman Wold, says that every covariance-stationary time series $(Y_{t})_{t}$ can be written as the sum of a deterministic component and an infinite moving average (MA($\infty$)) process of its innovations. This is also known as the Wold decomposition theorem.

Formally

$Y_{t}=\sum _{j=1}^{\infty }b_{j}\varepsilon _{t-j}+\varepsilon _{t}+\eta _{t},$ where:

$Y_{t}$ is the considered time series,

$\varepsilon _{t}$ is the innovation process associated with the process $Y_{t}$; it may also be interpreted as the forecast error in prediction applications (actual minus predicted value),

$b$ is the infinite vector of moving average weights (coefficients or parameters), which is absolutely summable, $\sum _{j=0}^{\infty }|b_{j}|<\infty$, with $b_{0}=1$,

and $\eta _{t}$ is a purely deterministic component, which is zero in the absence of trends in $(Y_{t})_{t}$.
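The decomposition can be made concrete for a simple covariance-stationary process. A minimal numerical sketch (using NumPy, with an AR(1) process chosen purely for illustration): for $Y_{t}=\phi Y_{t-1}+\varepsilon _{t}$ with $|\phi |<1$, the Wold coefficients are $b_{j}=\phi ^{j}$ and the deterministic component $\eta _{t}$ is zero, so the AR recursion and the (truncated) MA($\infty$) sum build the same series.

```python
import numpy as np

# Sketch: verify the Wold representation of an AR(1) process.
# For Y_t = phi * Y_{t-1} + eps_t with |phi| < 1, the Wold
# coefficients are b_j = phi**j (so b_0 = 1) and eta_t = 0.

rng = np.random.default_rng(0)
phi, n = 0.6, 500
eps = rng.standard_normal(n)

# Construction 1: the AR(1) recursion, started from eps[0].
y_ar = np.zeros(n)
y_ar[0] = eps[0]
for t in range(1, n):
    y_ar[t] = phi * y_ar[t - 1] + eps[t]

# Construction 2: the truncated Wold sum Y_t ~ sum_j phi**j * eps_{t-j}.
J = 60  # truncation point; phi**60 is negligible
b = phi ** np.arange(J)
y_ma = np.array([sum(b[j] * eps[t - j] for j in range(min(J, t + 1)))
                 for t in range(n)])

# The two constructions agree up to the truncation error.
print(np.max(np.abs(y_ar - y_ma)))  # tiny, of order phi**J
```

The geometric decay of $b_{j}$ here is what makes the absolute-summability condition $\sum _{j}|b_{j}|<\infty$ hold; processes whose weights decay more slowly still satisfy the theorem as long as that sum is finite.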

The usefulness of Wold's theorem is that it allows the dynamic evolution of a variable $Y_{t}$ to be approximated by a linear model. If the innovations $\varepsilon _{t}$ are independent, then the linear model is the only possible representation. However, when $\varepsilon _{t}$ is merely an uncorrelated but not independent sequence, the linear model exists but is not the only representation of the dynamic dependence of the series. In this latter case, the linear model may not be very useful, and a nonlinear model may better relate the observed value of $Y_{t}$ to its past evolution. The Wold representation depends on an infinite number of parameters and is therefore not directly useful in practice. To solve this problem, it is approximated by models with a finite number of parameters, possibly by adding an autoregressive part to the equation.