Discrete-Time Random Processes

References:
- Slides of EE4C03, TU Delft
- Hayes, M. H., Statistical Digital Signal Processing and Modeling. John Wiley & Sons, 2009.

Random Variables

Definitions

A random variable $x$ is a function that assigns a number to each outcome of a random experiment.


  • Probability distribution function
    $$F_x(\alpha)=\Pr(x\le \alpha)$$

  • Probability density function
    $$f_x(\alpha)=\frac{d}{d\alpha}F_x(\alpha)$$

  • Mean or expected value
    $$m_x=E\{x\}=\int_{-\infty}^\infty \alpha f_x(\alpha)\,d\alpha$$

  • Variance
    $$\sigma_x^2=E\{(x-m_x)^2\}=\int_{-\infty}^\infty(\alpha-m_x)^2f_x(\alpha)\,d\alpha=E\{x^2\}-m_x^2$$

  • Joint probability distribution function
    $$F_{x,y}(\alpha,\beta)=\Pr\{x\le \alpha,\,y\le \beta\}$$

  • Joint probability density function
    $$f_{x,y}(\alpha,\beta)=\frac{\partial^2}{\partial \alpha\,\partial \beta}F_{x,y}(\alpha,\beta)$$

  • Correlation
    $$r_{xy}=E\{xy^*\}$$

  • Covariance
    $$c_{xy}=\mathrm{Cov}(x,y)=E\{(x-m_x)(y-m_y)^*\}=r_{xy}-m_xm_y^*$$

  • Correlation coefficient
    $$\rho_{xy}=\frac{c_{xy}}{\sigma_x\sigma_y}=\frac{r_{xy}-m_xm_y^*}{\sigma_x\sigma_y},\qquad |\rho_{xy}|\le 1$$
    Proof: Define an inner product on the set of random variables,
    $$\langle x,y\rangle:=E\{xy^*\}.$$
    The Cauchy–Schwarz inequality then gives
    $$|\langle x-m_x,y-m_y\rangle|^2\le\langle x-m_x,x-m_x\rangle\cdot\langle y-m_y,y-m_y\rangle,$$
    i.e. $|c_{xy}|\le \sigma_x \sigma_y$.

    The correlation coefficient $\rho_{xy}$ plays the same role as $\cos\theta$ in the inner product of two vectors. (A small numerical sketch of these quantities appears after this list.)

  • Two random variables $x$ and $y$ are independent if
    $$f_{x,y}(\alpha,\beta)=f_x(\alpha)f_y(\beta)$$

  • Two random variables $x$ and $y$ are uncorrelated if
    $$E\{xy^*\}=E\{x\}E\{y^*\}\quad\text{or}\quad r_{xy}=m_xm_y^*\quad\text{or}\quad c_{xy}=0$$

  • Two random variables $x$ and $y$ are orthogonal if
    $$E\{xy^*\}=0\quad\text{or}\quad r_{xy}=0$$
    Orthogonal random variables are not necessarily uncorrelated.

    But orthogonal $\iff$ uncorrelated if $m_x=m_y=0$.
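
As a small illustration (not part of the original notes; the distributions, seed, and sample size below are arbitrary choices), this NumPy sketch estimates the correlation, covariance, and correlation coefficient from samples and checks the identities above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two jointly distributed real random variables: y is a noisy linear function of x.
N = 100_000
x = rng.normal(loc=1.0, scale=2.0, size=N)
y = 2.0 * x + rng.normal(loc=-0.5, scale=1.0, size=N)

# Sample estimates of the moments defined above (real data, so conjugates are dropped).
m_x, m_y = x.mean(), y.mean()
r_xy = np.mean(x * y)                    # correlation          E{x y*}
c_xy = np.mean((x - m_x) * (y - m_y))    # covariance           E{(x-m_x)(y-m_y)*}
rho = c_xy / (x.std() * y.std())         # correlation coefficient

print(f"r_xy ≈ {r_xy:.3f},  c_xy ≈ {c_xy:.3f},  rho ≈ {rho:.3f}")
print("c_xy == r_xy - m_x*m_y :", np.isclose(c_xy, r_xy - m_x * m_y))
print("|rho| <= 1             :", abs(rho) <= 1.0)
```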


Linear Mean-Square Estimation

In mean-square estimation, an estimate $\hat y$ is to be found that minimizes the mean-square error
$$\xi=E\left\{(y-\hat y)^2\right\}.$$
Although the solution to this problem generally leads to a nonlinear estimator, in many cases a linear estimator is preferred. In linear mean-square estimation, the estimator is constrained to be of the form
$$\hat y=ax+b$$
and the goal is to find the values for $a$ and $b$ that minimize the mean-square error
$$\xi=E\left\{(y-ax-b)^2\right\}.$$
The linear mean-square estimation problem may be solved by differentiating $\xi$ with respect to $a$ and $b$ and setting the derivatives equal to zero:
$$\begin{aligned} &\frac{\partial \xi}{\partial a}=-2E\{(y-ax-b)x\}=0 \qquad (*)\\ &\frac{\partial \xi}{\partial b}=-2E\{y-ax-b\}=0 \end{aligned}$$
Before solving these equations for $a$ and $b$, note that $(*)$ says that
$$E\{(y-\hat y)x\}=E\{ex\}=0,$$
where $e=y-\hat y$ is the estimation error. This relationship, known as the orthogonality principle, states that for the optimum linear estimator the estimation error is orthogonal to the data $x$.

Solving these equations for $a$ and $b$, we find
$$\begin{aligned} a&=\frac{E\{xy\}-m_xm_y}{\sigma_x^2}=\frac{c_{xy}}{\sigma_x^2}=\rho_{xy}\frac{\sigma_y}{\sigma_x},\\ b&=\frac{E\{x^2\}m_y-E\{xy\}m_x}{\sigma_x^2}=m_y-am_x. \end{aligned}$$
Therefore, the optimal linear estimator is
$$\hat y=\rho_{xy}\frac{\sigma_y}{\sigma_x}(x-m_x)+m_y,$$
and the minimum mean-square error is easily calculated to be
$$\xi_{\min}=\sigma_y^2(1-\rho_{xy}^2).$$


We see that the correlation coefficient provides a measure of the linear predictability between random variables. The closer $|\rho_{xy}|$ is to $1$, the smaller the mean-square error in the estimation of $y$ using a linear estimator.
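
The closed-form expressions for $a$, $b$, and $\xi_{\min}$ can be checked numerically. The sketch below is illustrative only (the joint distribution and parameters are arbitrary choices, not from the notes); it also verifies the orthogonality principle $E\{ex\}=0$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Jointly distributed (x, y); the linear-plus-noise model is an arbitrary choice.
N = 200_000
x = rng.normal(2.0, 3.0, size=N)
y = -1.5 * x + rng.normal(0.0, 2.0, size=N)

m_x, m_y = x.mean(), y.mean()
s_x, s_y = x.std(), y.std()
rho = np.mean((x - m_x) * (y - m_y)) / (s_x * s_y)

# Optimal linear estimator y_hat = a*x + b from the closed-form solution above.
a = rho * s_y / s_x
b = m_y - a * m_x
y_hat = a * x + b
e = y - y_hat

print(f"a ≈ {a:.3f}, b ≈ {b:.3f}")
# Orthogonality principle: the error is orthogonal to the data x.
print("E{e*x}              ≈", np.mean(e * x))
# Minimum mean-square error matches sigma_y^2 * (1 - rho^2).
print("MSE                 ≈", np.mean(e**2))
print("sigma_y^2*(1-rho^2) ≈", s_y**2 * (1.0 - rho**2))
```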

Random Process

Definition

A random process $x(n)$ is an indexed sequence of random variables (a “signal”).


  • Mean and variance
    $$m_x(n)=E\{x(n)\},\qquad \sigma_x^2(n)=E\{|x(n)-m_x(n)|^2\}$$

  • Autocorrelation and autocovariance
    $$\begin{aligned} r_x(k,l)&=E\{x(k)x^*(l)\}\\ c_x(k,l)&=E\{[x(k)-m_x(k)][x(l)-m_x(l)]^*\}=r_x(k,l)-m_x(k)m_x^*(l) \end{aligned}$$

  • Cross-correlation and cross-covariance
    $$\begin{aligned} r_{xy}(k,l)&=E\{x(k)y^*(l)\}\\ c_{xy}(k,l)&=E\{[x(k)-m_x(k)][y(l)-m_y(l)]^*\}=r_{xy}(k,l)-m_x(k)m_y^*(l) \end{aligned}$$

    • Two random processes $x(n)$ and $y(n)$ are said to be uncorrelated if $c_{xy}(k,l)=0$.
    • Two random processes $x(n)$ and $y(n)$ are said to be orthogonal if $r_{xy}(k,l)=0$.
    • For zero-mean processes: uncorrelated $\iff$ orthogonal.

Property: If two random processes $x(n)$ and $y(n)$ are uncorrelated, then the autocorrelation of the sum
$$z(n)=x(n)+y(n)$$
is equal to the sum of the autocorrelations of $x(n)$ and $y(n)$,
$$r_z(k,l)=r_x(k,l)+r_y(k,l).$$
(When the processes are uncorrelated, the cross-terms $r_{xy}(k,l)$ and $r_{yx}(k,l)$ reduce to products of the means, so strictly they vanish when at least one of the processes is zero mean.)
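
A quick numerical check of this property, assuming two independent zero-mean processes (the specific processes, seed, and lengths below are arbitrary choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_autocorr(x, maxlag):
    """Sample autocorrelation estimate r_x(k) for k = 0..maxlag (real-valued x)."""
    N = len(x)
    return np.array([np.mean(x[k:] * x[:N - k]) for k in range(maxlag + 1)])

# Two independent (hence uncorrelated) zero-mean processes.
N, maxlag = 200_000, 5
x = np.convolve(rng.normal(size=N), [1.0, 0.5], mode="same")  # correlated-in-time process
y = rng.normal(size=N)                                        # white process
z = x + y

print("r_x + r_y :", np.round(sample_autocorr(x, maxlag) + sample_autocorr(y, maxlag), 3))
print("r_z       :", np.round(sample_autocorr(z, maxlag), 3))
```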

Stationarity

  • First-order stationary if
    $$f_{x(n)}(\alpha)=f_{x(n+k)}(\alpha).$$
    This implies $m_x(n)=m_x(0):=m_x$.

  • Second-order stationary if
    $$f_{x(n_1),x(n_2)}(\alpha_1,\alpha_2)=f_{x(n_1+k),x(n_2+k)}(\alpha_1,\alpha_2).$$
    This implies $r_x(k,l)=r_x(k-l,0):=r_x(k-l)$.

  • Stationary in the strict sense if the process is stationary for all orders $L>0$.

  • Wide-sense stationary (WSS) if

    • $m_x(n)=m_x$
    • $r_x(k,l)=r_x(k-l)$
    • $c_x(0)<\infty$

  • Jointly wide-sense stationary if $x(n)$ and $y(n)$ are each wide-sense stationary and the cross-correlation $r_{xy}(k,l)$ depends only on the difference $k-l$:
    $$r_{xy}(k,l)=r_{xy}(k-l,0):=r_{xy}(k-l)$$

  • Properties of WSS processes

    • symmetry: $r_x(k)=r_x^*(-k)$
    • mean-square value: $r_x(0)=E\{|x(n)|^2\}\ge 0$
    • maximum value: $r_x(0)\ge |r_x(k)|$
    • mean-square periodicity: $r_x(k_0)=r_x(0)\iff r_x(k)$ is periodic with period $k_0$

    Proof of $r_x(0)\ge |r_x(k)|$:

    Using the Cauchy–Schwarz inequality,
    $$|\langle x(0),x(k)\rangle|\le\sqrt{\langle x(0),x(0)\rangle\cdot \langle x(k),x(k)\rangle},$$
    i.e.
    $$|r_x(k)|\le\sqrt{r_x(0)\cdot r_x(0)}=r_x(0).$$

Autocorrelation and autocovariance matrices

We consider a WSS process $x(n)$ and collect $p+1$ samples in a vector
$$\mathbf{x}=[x(0),x(1),\cdots,x(p)]^T$$

  • Autocorrelation matrix
    $$\mathbf R_x=E\{\mathbf x \mathbf x^H\}=\begin{bmatrix}r_x(0) & r_x^*(1) & \cdots &r_x^*(p)\\ r_x(1) & r_x(0) & \cdots &r_x^*(p-1)\\ \vdots & \vdots & &\vdots\\ r_x(p) & r_x(p-1) & \cdots &r_x(0) \end{bmatrix}$$
    Properties

    • The autocorrelation matrix of a WSS random process $x(n)$ is a Hermitian Toeplitz matrix, $\mathbf R_x=\mathrm{Toep}\{r_x(0),r_x(1),\cdots,r_x(p)\}$.
    • The autocorrelation matrix of a WSS random process is nonnegative definite, $\mathbf R_x\ge 0$, since $\mathbf a^H\mathbf R_x \mathbf a=E\{|\mathbf a^H \mathbf x|^2\}\ge 0$.
    • The eigenvalues $\lambda_k$ of the autocorrelation matrix of a WSS random process are real-valued and nonnegative.
  • Autocovariance matrix
    $$\mathbf C_x=E\{(\mathbf x-\mathbf m_x)(\mathbf x-\mathbf m_x)^H\}=\mathbf R_x-\mathbf m_x \mathbf m_x^H$$
    where $\mathbf m_x=[m_x,\cdots,m_x]^T$. (A numerical check of these matrices follows this list.)
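
The sketch below is illustrative only (the filtered-noise process and the order $p$ are arbitrary choices): it estimates $r_x(k)$ from data, forms the Hermitian Toeplitz matrix $\mathbf R_x$, and checks that $r_x(0)\ge|r_x(k)|$ and that the eigenvalues are real and nonnegative.

```python
import numpy as np

rng = np.random.default_rng(3)

# A WSS process: white noise through a short FIR filter (arbitrary choice).
N, p = 100_000, 4
v = rng.normal(size=N)
x = np.convolve(v, [1.0, -0.8, 0.3], mode="same")

# Estimate r_x(0..p) and build the (p+1) x (p+1) Toeplitz matrix R_x.
r = np.array([np.mean(x[k:] * x[:N - k]) for k in range(p + 1)])
R = np.empty((p + 1, p + 1))
for i in range(p + 1):
    for j in range(p + 1):
        R[i, j] = r[abs(i - j)]       # real process: r_x(-k) = r_x(k)

eigvals = np.linalg.eigvalsh(R)
print("r_x(0..p)              :", np.round(r, 3))
print("|r_x(k)| <= r_x(0)     :", np.all(np.abs(r) <= r[0] + 1e-9))
print("eigenvalues            :", np.round(eigvals, 3))
print("eigenvalues nonnegative:", np.all(eigvals >= -1e-9))
```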

Ergodicity


When is the sample mean equal to the ensemble mean (expectation)?

  • Sample mean
    $$\hat m_x(N)=\frac{1}{N}\sum_{n=0}^{N-1}x(n)$$

  • A WSS process is ergodic in the mean if (mean-square convergence)
    $$\lim_{N\to \infty}E\{|\hat m_x(N)-m_x|^2\}=0,\quad\text{i.e.}\quad \lim_{N\to \infty}\hat m_x(N)=m_x\ \text{in the mean-square sense,}$$
    which holds if and only if
    $$E\{\hat m_x(N)\}=m_x\ \text{(unbiased)}\quad\text{and}\quad \lim_{N\to \infty}\mathrm{Var}\{\hat m_x(N)\}=0.$$
    From the definition of the sample mean it follows easily that the sample mean is unbiased for any wide-sense stationary process,
    $$E\{\hat m_x(N)\}=\frac{1}{N}\sum_{n=0}^{N-1}E\{x(n)\}=m_x.$$
    In order for the variance to go to zero, however, some constraints must be placed on the process $x(n)$:
    $$\begin{aligned} \mathrm{Var}\{\hat m_x(N)\}&=E\{|\hat m_x(N)-m_x|^2\}=E\left\{\left|\frac{1}{N}\sum_{n=0}^{N-1}[x(n)-m_x]\right|^2\right\}\\ &=\frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}E\{[x(m)-m_x][x(n)-m_x]^*\}\\ &=\frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}c_x(m-n), \end{aligned}$$
    where $c_x(m-n)$ is the autocovariance of $x(n)$. Grouping together common terms, we may write the variance as
    $$\begin{aligned} \mathrm{Var}\{\hat m_x(N)\} &=\frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}c_x(m-n)=\frac{1}{N^2}\sum_{k=-N+1}^{N-1}(N-|k|)c_x(k)\\ &=\frac{1}{N}\sum_{k=-N+1}^{N-1}\left(1-\frac{|k|}{N}\right)c_x(k). \end{aligned}$$
    (A simulation of this behavior is sketched after the theorems below.)

    • Mean Ergodic Theorem 1

      Let $x(n)$ be a WSS random process with autocovariance sequence $c_x(k)$. A necessary and sufficient condition for $x(n)$ to be ergodic in the mean is
      $$\lim_{N\to\infty}\frac{1}{N}\sum_{k=-N+1}^{N-1}c_x(k)=0.$$

    • Mean Ergodic Theorem 2

      Let $x(n)$ be a WSS random process with autocovariance sequence $c_x(k)$. Sufficient conditions for $x(n)$ to be ergodic in the mean are that $c_x(0)<\infty$ and
      $$\lim_{k\to\infty}c_x(k)=0.$$
      In other words, a WSS process will be ergodic in the mean if it is asymptotically uncorrelated.
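
A minimal simulation of ergodicity in the mean (assumptions: an AR(1)-type process whose autocovariance decays geometrically to zero; the parameters $a$, $m$, and the record sizes are arbitrary choices, not from the notes). The expectation of the sample mean stays at $m_x$ while its variance shrinks as $N$ grows.

```python
import numpy as np

rng = np.random.default_rng(4)

# x(n) = m + s(n), with s(n) = a*s(n-1) + v(n) and white v(n).
# Its autocovariance c_x(k) decays like a^|k| -> 0, so the process is ergodic in the mean.
a, m, n_real, N = 0.9, 1.0, 500, 5000
v = rng.normal(size=(n_real, N))
s = np.zeros((n_real, N))
for n in range(1, N):
    s[:, n] = a * s[:, n - 1] + v[:, n]
x = m + s

for Ns in (100, 1000, 5000):
    sample_means = x[:, :Ns].mean(axis=1)   # one sample mean per realization
    print(f"N={Ns:5d}:  E{{m_hat}} ≈ {sample_means.mean():.3f},  "
          f"Var{{m_hat}} ≈ {sample_means.var():.4f}")
```

Across the realizations, the printed variance drops roughly in proportion to $1/N$, as the formula for $\mathrm{Var}\{\hat m_x(N)\}$ predicts for a summable autocovariance.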

White noise

  • White noise is a discrete-time random process $v(n)$ with autocovariance
    $$c_v(k)=\sigma_v^2\delta(k),$$
    i.e., a sequence of uncorrelated random variables, each having variance $\sigma_v^2$.

  • White noise is defined only in terms of the form of its second-order moments $\to$ there is an infinite variety of white noise random processes: white Gaussian noise, the Bernoulli process, …

  • The power spectrum (defined later) of zero-mean white noise is constant:
    $$P_v(e^{j\omega})=\sum_{k=-\infty}^\infty r_v(k)e^{-jk\omega}=\sigma_v^2$$
    (see the numerical sketch after this list).

  • For complex white noise $v(n)=v_1(n)+jv_2(n)$,
    $$E\{|v(n)|^2\}=E\{|v_1(n)|^2\}+E\{|v_2(n)|^2\},$$
    i.e., the variance of $v(n)$ is the sum of the variances of the real and imaginary components, $v_1(n)$ and $v_2(n)$.
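
A short NumPy check (illustrative only; $\sigma_v$, the record length, and the segmentation are arbitrary choices) that the sample autocovariance of zero-mean white Gaussian noise is close to $\sigma_v^2\delta(k)$ and that its averaged periodogram is roughly flat at $\sigma_v^2$.

```python
import numpy as np

rng = np.random.default_rng(5)

sigma_v, N = 1.5, 200_000
v = sigma_v * rng.normal(size=N)            # zero-mean white Gaussian noise

# Autocovariance estimate: should be close to sigma_v^2 * delta(k).
c = np.array([np.mean(v[k:] * v[:N - k]) for k in range(5)])
print("c_v(0..4):", np.round(c, 3), " (sigma_v^2 =", sigma_v**2, ")")

# Averaged periodogram: roughly flat at sigma_v^2 across frequency.
nseg, L = 400, 500
segs = v[:nseg * L].reshape(nseg, L)
mean_periodogram = np.mean(np.abs(np.fft.rfft(segs, axis=1))**2 / L, axis=0)
print("mean periodogram (min, max):",
      np.round(mean_periodogram.min(), 2), np.round(mean_periodogram.max(), 2))
```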

Power spectrum

  • The power spectrum of a WSS process is the DTFT of the autocorrelation:
    $$P_x(e^{j\omega})=\sum_{k=-\infty}^{\infty}r_x(k)e^{-jk\omega}.$$
    The autocorrelation sequence may be recovered by taking the inverse discrete-time Fourier transform of $P_x(e^{j\omega})$,
    $$r_x(k)=\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})e^{jk\omega}\,d\omega.$$
    In some cases it may be more convenient to use the z-transform instead of the discrete-time Fourier transform:
    $$P_x(z)=\sum_{k=-\infty}^{\infty}r_x(k)z^{-k}.$$

  • Since the autocorrelation is conjugate symmetric, the power spectrum is real:
    $$P_x(z)=P_x^*(1/z^*) \quad \Rightarrow \quad P_x(e^{j\omega})=P_x^*(e^{j\omega})$$

  • If the stochastic process is real, the power spectrum is even:
    $$P_x(z)=P_x^*(z^*) \quad \Rightarrow \quad P_x(e^{j\omega})=P_x^*(e^{-j\omega})=P_x(e^{-j\omega})$$

  • The power spectrum is nonnegative:
    $$P_x(e^{j\omega})\ge 0$$

  • The total power is proportional to the area under the power spectrum:
    $$E\{|x(n)|^2\}=r_x(0)=\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})\,d\omega$$

  • The eigenvalues $\lambda_i$ of the $n\times n$ autocorrelation matrix are upper and lower bounded by the maximum and minimum values, respectively, of the power spectrum:
    $$\min_\omega P_x(e^{j\omega})\le \lambda_i\le\max_\omega P_x(e^{j\omega})$$
    Proof: Let $\lambda_i$ and $\mathbf q_i$ be the eigenvalues and eigenvectors, respectively, of the $n\times n$ autocorrelation matrix $\mathbf R_x$,
    $$\mathbf R_x \mathbf q_i=\lambda_i \mathbf q_i;\quad i=1,2,\cdots,n.$$
    Since $\mathbf q_i^H\mathbf R_x \mathbf q_i=\lambda_i \mathbf q_i^H\mathbf q_i$,
    $$\lambda_i=\frac{\mathbf q_i^H\mathbf R_x \mathbf q_i}{\mathbf q_i^H\mathbf q_i}.$$
    Expanding the Hermitian form in the numerator, we have
    $$\begin{aligned} \mathbf q_i^H\mathbf R_x \mathbf q_i&=\sum_{k=0}^{n-1}\sum_{l=0}^{n-1}q_i^*(k)r_x(k-l)q_i(l)\\ &=\frac{1}{2\pi}\sum_{k=0}^{n-1}\sum_{l=0}^{n-1}q_i^*(k)q_i(l)\int_{-\pi}^\pi P_x(e^{j\omega})e^{j\omega(k-l)}\,d\omega\\ &=\frac{1}{2\pi}\int_{-\pi}^\pi\left[\sum_{k=0}^{n-1}q_i^*(k)e^{j\omega k}\right]\left[\sum_{l=0}^{n-1}q_i(l) e^{-j\omega l}\right]P_x(e^{j\omega})\,d\omega. \end{aligned}$$
    With $Q_i(e^{j\omega})=\sum_{k=0}^{n-1}q_i(k)e^{-jk\omega}$,
    $$\mathbf q_i^H\mathbf R_x \mathbf q_i=\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})|Q_i(e^{j\omega})|^2\,d\omega.$$
    Repeating these steps we find
    $$\mathbf q_i^H \mathbf q_i=\frac{1}{2\pi}\int_{-\pi}^\pi |Q_i(e^{j\omega})|^2\,d\omega.$$
    Therefore,
    $$\min_\omega P_x(e^{j\omega})\le \lambda_i=\frac{\mathbf q_i^H\mathbf R_x \mathbf q_i}{\mathbf q_i^H\mathbf q_i}=\frac{\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})|Q_i(e^{j\omega})|^2\,d\omega}{\frac{1}{2\pi}\int_{-\pi}^\pi |Q_i(e^{j\omega})|^2\,d\omega}\le\max_\omega P_x(e^{j\omega}).$$

  • The power spectrum is related to the mean of $|X(e^{j\omega})|^2$ as
    $$P_x(e^{j\omega})=\lim_{N\to\infty}\frac{1}{2N+1}E\left\{\left|\sum_{n=-N}^N x(n)e^{-j\omega n}\right|^2\right\}.$$
    In other words, the power spectrum is the limit of the ensemble average (expectation) of the normalized squared Fourier magnitude of a finite data record (a numerical illustration is sketched after this list).

    Proof: Denote the normalized squared Fourier magnitude of the $2N+1$ samples by
    $$P_N(e^{j\omega})\triangleq \frac{1}{2N+1}\left|\sum_{n=-N}^N x(n)e^{-j\omega n}\right|^2,$$
    so the claim becomes
    $$P_x(e^{j\omega})=\lim_{N\to \infty}E\{P_N(e^{j\omega})\}.$$
    Since
    $$\begin{aligned} E\{P_N(e^{j\omega})\}&=\frac{1}{2N+1}E\left\{\left(\sum_{n=-N}^N x(n)e^{-j\omega n}\right)\left(\sum_{m=-N}^N x(m)e^{-j\omega m}\right)^*\right\}\\ &=\frac{1}{2N+1}E\left\{\sum_{n=-N}^N \sum_{m=-N}^N x(n)x^*(m)e^{-j\omega (n-m)}\right\}\\ &=\frac{1}{2N+1}\sum_{n=-N}^N \sum_{m=-N}^N r_x(n-m)e^{-j\omega (n-m)}\\ &=\frac{1}{2N+1}\sum_{k=-2N}^{2N}(2N+1-|k|)r_x(k)e^{-jk\omega}, \end{aligned}$$
    and assuming that the autocorrelation sequence decays to zero fast enough that $\sum_{k=-\infty}^\infty|k|\,|r_x(k)|<\infty$, letting $N\to \infty$ it follows that
    $$\lim_{N\to \infty}E\{P_N(e^{j\omega})\}=\sum_{k=-\infty}^\infty r_x(k)e^{-jk\omega}=P_x(e^{j\omega}).$$

  • If $x(n)$ has a nonzero mean or a periodicity, the power spectrum contains impulses.
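
The sketch below illustrates the limit $P_x(e^{j\omega})=\lim_{N\to\infty}E\{P_N(e^{j\omega})\}$ for an MA(1) process, whose power spectrum is known in closed form, and also checks the eigenvalue bounds stated above. It is illustrative only; the process, record length, and matrix size are arbitrary choices, not from the notes.

```python
import numpy as np

rng = np.random.default_rng(6)

# MA(1) process x(n) = v(n) + b*v(n-1), v(n) white with unit variance.
# Then r_x(0) = 1 + b^2, r_x(±1) = b, and P_x(e^{jw}) = 1 + b^2 + 2*b*cos(w).
b, L, nseg = 0.7, 256, 2000
v = rng.normal(size=nseg * L + 1)
x = v[1:] + b * v[:-1]
segs = x.reshape(nseg, L)

# Average P_N over many data records (segments of one long realization)
# and compare with the analytic power spectrum at the FFT bin frequencies.
w = 2 * np.pi * np.arange(L // 2 + 1) / L
P_N_mean = np.mean(np.abs(np.fft.rfft(segs, axis=1))**2 / L, axis=0)
P_true = 1 + b**2 + 2 * b * np.cos(w)
print("max |mean P_N - P_x| :", np.round(np.max(np.abs(P_N_mean - P_true)), 3))

# Eigenvalues of the autocorrelation matrix lie between min and max of P_x.
p = 8
r = np.array([1 + b**2, b] + [0.0] * (p - 1))        # exact r_x(k) for the MA(1) model
R = np.array([[r[abs(i - j)] for j in range(p + 1)] for i in range(p + 1)])
lam = np.linalg.eigvalsh(R)
print("min P_x =", round((1 - b)**2, 3), " max P_x =", round((1 + b)**2, 3))
print("eigenvalues:", np.round(lam, 3))
```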

Filtering Random Processes

  • The relationship between the mean and autocorrelation of the input process and those of the output process:

    Suppose $x(n)$ is a WSS process with mean $m_x$ and autocorrelation $r_x(k)$ that is filtered by a stable LSI filter with unit sample response $h(n)$; then the output $y(n)$ is also WSS with
    $$\begin{aligned} m_y&=m_xH(e^{j0})\\ r_y(k)&=r_x(k)*h(k)*h^*(-k)\\ &\triangleq r_{yx}(k)*h^*(-k)\\ &\triangleq r_x(k)*r_h(k) \end{aligned}$$
    (a numerical check is sketched at the end of this section).


Proof: The output $y(n)$ is related to $x(n)$ by the convolution sum
$$y(n)=x(n)*h(n)=\sum_{k=-\infty}^\infty h(k)x(n-k).$$
The mean of $y(n)$ may be found by taking the expected value of both sides,
$$\begin{aligned} E\{y(n)\}&=E\left\{\sum_{k=-\infty}^\infty h(k)x(n-k)\right\}=\sum_{k=-\infty}^\infty h(k)E\{x(n-k)\}\\ &=m_x\sum_{k=-\infty}^\infty h(k)=m_xH(e^{j0}). \end{aligned}$$
Before computing the autocorrelation of $y(n)$, we first compute the cross-correlation between $y(n)$ and $x(n)$ (the autocorrelation can also be computed directly):
$$\begin{aligned} r_{yx}(n+k,n)&=E\{y(n+k)x^*(n)\}=E\left\{\sum_{l=-\infty}^\infty h(l)x(n+k-l)x^*(n)\right\}\\ &=\sum_{l=-\infty}^\infty h(l)E\{x(n+k-l)x^*(n)\}\\ &=\sum_{l=-\infty}^\infty h(l) r_x(k-l)=r_x(k)*h(k). \end{aligned}$$
The autocorrelation of $y(n)$ may now be determined as follows,
$$\begin{aligned} r_{y}(n+k,n)&=E\{y(n+k)y^*(n)\}=E\left\{y(n+k)\sum_{l=-\infty}^\infty x^*(l)h^*(n-l)\right\}\\ &=\sum_{l=-\infty}^\infty h^*(n-l)E\{y(n+k)x^*(l)\}\\ &=\sum_{l=-\infty}^\infty h^*(n-l) r_{yx}(n+k-l)=r_{yx}(k)*h^*(-k). \end{aligned}$$
Therefore, the autocorrelation sequence $r_y(n+k,n)$ depends only on $k$, the difference between the indices $n+k$ and $n$, i.e.,
$$r_y(k)=r_{yx}(k)*h^*(-k)=r_x(k)*h(k)*h^*(-k)=\sum_{l=-\infty}^\infty \sum_{m=-\infty}^\infty h(l)r_x(m-l+k)h^*(m).$$
Another interpretation can be obtained by defining
$$r_h(k)=h(k)*h^*(-k)=\sum_{n=-\infty}^{\infty}h(n+k)h^*(n).$$
Then
$$r_y(k)=r_x(k)*r_h(k).$$

  • The power of $y(n)$ is given by
    $$E\{|y(n)|^2\}=r_y(0)=\sum_{l=-\infty}^\infty \sum_{m=-\infty}^\infty h(l)r_x(m-l)h^*(m)=\mathbf h^H \mathbf R_x \mathbf h,$$
    where we assume $h(n)$ is zero outside $[0,N-1]$ and $\mathbf h=[h(0),h(1),\cdots,h(N-1)]^T$.

  • In terms of the power spectrum,
    $$\begin{aligned} P_y(e^{j\omega})&=P_x(e^{j\omega})|H(e^{j\omega})|^2\\ P_y(z)&=P_x(z)H(z)H^*(1/z^*) \end{aligned}$$
    So, assuming no pole/zero cancellations between $P_x(z)$ and $H(z)$, if $H(z)$ has a pole (zero) at $z=z_0$, then $P_y(z)$ also has a pole (zero) at $z=z_0$ and another at the conjugate reciprocal location $z=1/z_0^*$.

  • If $H(e^{j\omega})$ is a narrow-band bandpass filter with center frequency $\omega_0$, bandwidth $\Delta \omega$, and magnitude $1$, then the output power is
    $$E\{|y(n)|^2\}=r_y(0)=\frac{1}{2\pi}\int_{-\pi}^\pi |H(e^{j\omega})|^2P_x(e^{j\omega})\,d\omega \approx \frac{\Delta \omega}{2\pi}P_x(e^{j\omega_0}).$$
    Since $E\{|y(n)|^2\}$ represents the power of $x(n)$ within the band $\Delta \omega$, the power spectrum $P_x(e^{j\omega})$ may be viewed as a density function that describes how the power in $x(n)$ varies with $\omega$, i.e., how the power is distributed over frequency.
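
As a closing numerical check (illustrative only; the filter $h$, the white input, and the record lengths are arbitrary choices, not from the notes), the sketch below verifies $r_y(k)=r_x(k)*r_h(k)$, the output power $r_y(0)=\mathbf h^H\mathbf R_x\mathbf h$, and $P_y(e^{j\omega})=\sigma_v^2|H(e^{j\omega})|^2$ for a white-noise input.

```python
import numpy as np

rng = np.random.default_rng(7)

# White noise with variance sigma2 through a short FIR filter h.
sigma2, N = 2.0, 400_000
h = np.array([1.0, -0.5, 0.25])
x = np.sqrt(sigma2) * rng.normal(size=N)
y = np.convolve(x, h, mode="same")

# r_y(k) should equal r_x(k) * r_h(k) = sigma2 * r_h(k) for white input.
r_h = np.correlate(h, h, mode="full")            # deterministic autocorrelation of h
maxlag = len(h) - 1
r_y_est = np.array([np.mean(y[k:] * y[:N - k]) for k in range(maxlag + 1)])
print("sigma2 * r_h(k), k>=0 :", sigma2 * r_h[maxlag:])
print("estimated r_y(k)      :", np.round(r_y_est, 3))

# Output power two ways: r_y(0) and h^H R_x h (R_x = sigma2 * I for white input).
R_x = sigma2 * np.eye(len(h))
print("h^H R_x h =", h @ R_x @ h, "  vs  r_y(0) ≈", round(r_y_est[0], 3))

# Power spectrum check: P_y(e^{jw}) = sigma2 * |H(e^{jw})|^2 for white input,
# compared against an averaged periodogram of the output.
L, nseg = 512, 700
segs = y[: L * nseg].reshape(nseg, L)
P_y_est = np.mean(np.abs(np.fft.rfft(segs, axis=1))**2 / L, axis=0)
w = 2 * np.pi * np.arange(L // 2 + 1) / L
H = np.array([np.sum(h * np.exp(-1j * wk * np.arange(len(h)))) for wk in w])
print("max |P_y_est - sigma2*|H|^2| :",
      np.round(np.max(np.abs(P_y_est - sigma2 * np.abs(H)**2)), 3))
```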