Discrete-Time Random Processes

References:
- Slides of EE4C03, TU Delft
- Hayes, M. H., Statistical Digital Signal Processing and Modeling. John Wiley & Sons, 2009.

Random Variables

Definitions

A random variable $x$ is a function that assigns a number to each outcome of a random experiment.


  • Probability distribution function
    $$F_x(\alpha)=\Pr(x\le \alpha)$$

  • Probability density function
    $$f_x(\alpha)=\frac{d}{d\alpha}F_x(\alpha)$$

  • Mean or expected value
    $$m_x=E\{x\}=\int_{-\infty}^\infty \alpha f_x(\alpha)\,d\alpha$$

  • Variance
    $$\sigma_x^2=E\{(x-m_x)^2\}=\int_{-\infty}^\infty(\alpha-m_x)^2f_x(\alpha)\,d\alpha=E\{x^2\}-m_x^2$$

  • Joint probability distribution function
    $$F_{x,y}(\alpha,\beta)=\Pr\{x\le \alpha,\,y\le \beta\}$$

  • Joint probability density function
    $$f_{x,y}(\alpha,\beta)=\frac{\partial^2}{\partial \alpha\,\partial \beta}F_{x,y}(\alpha,\beta)$$

  • Correlation
    $$r_{xy}=E\{xy^*\}$$

  • Covariance
    $$c_{xy}=\mathrm{Cov}(x,y)=E\{(x-m_x)(y-m_y)^*\}=r_{xy}-m_xm_y^*$$

  • Correlation coefficient
    $$\rho_{xy}=\frac{c_{xy}}{\sigma_x\sigma_y}=\frac{r_{xy}-m_xm_y^*}{\sigma_x\sigma_y},\qquad |\rho_{xy}|\le 1$$
    Proof: Define an inner product on the set of random variables,
    $$\langle x,y\rangle:=E\{xy^*\}.$$
    The Cauchy–Schwarz inequality then gives
    $$|\langle x-m_x,y-m_y\rangle|^2\le\langle x-m_x,x-m_x\rangle\cdot\langle y-m_y,y-m_y\rangle,$$
    i.e. $|c_{xy}|\le \sigma_x \sigma_y$.

    The correlation coefficient $\rho_{xy}$ plays the same role as $\cos\theta$ in the inner product of two vectors. (A small numerical sketch of these quantities appears after this list.)

  • Two random variables $x$ and $y$ are independent if
    $$f_{x,y}(\alpha,\beta)=f_x(\alpha)f_y(\beta)$$

  • Two random variables $x$ and $y$ are uncorrelated if
    $$E\{xy^*\}=E\{x\}E\{y^*\}\quad\text{or}\quad r_{xy}=m_xm_y^*\quad\text{or}\quad c_{xy}=0$$

  • Two random variables $x$ and $y$ are orthogonal if
    $$E\{xy^*\}=0\quad\text{or}\quad r_{xy}=0$$
    Orthogonal random variables are not necessarily uncorrelated.

    But orthogonal $\iff$ uncorrelated if $m_x=m_y=0$.
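
As a small illustration (not part of the original notes; the distributions, seed, and sample size below are arbitrary choices), this NumPy sketch estimates the correlation, covariance, and correlation coefficient from samples and checks the identities above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two jointly distributed real random variables: y is a noisy linear function of x.
N = 100_000
x = rng.normal(loc=1.0, scale=2.0, size=N)
y = 2.0 * x + rng.normal(loc=-0.5, scale=1.0, size=N)

# Sample estimates of the moments defined above (real data, so conjugates are dropped).
m_x, m_y = x.mean(), y.mean()
r_xy = np.mean(x * y)                    # correlation          E{x y*}
c_xy = np.mean((x - m_x) * (y - m_y))    # covariance           E{(x-m_x)(y-m_y)*}
rho = c_xy / (x.std() * y.std())         # correlation coefficient

print(f"r_xy ≈ {r_xy:.3f},  c_xy ≈ {c_xy:.3f},  rho ≈ {rho:.3f}")
print("c_xy == r_xy - m_x*m_y :", np.isclose(c_xy, r_xy - m_x * m_y))
print("|rho| <= 1             :", abs(rho) <= 1.0)
```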


Linear Mean-Square Estimation

In mean-square estimation, an estimate $\hat y$ is to be found that minimizes the mean-square error
$$\xi=E\left\{(y-\hat y)^2\right\}.$$
Although the solution to this problem generally leads to a nonlinear estimator, in many cases a linear estimator is preferred. In linear mean-square estimation, the estimator is constrained to be of the form
$$\hat y=ax+b$$
and the goal is to find the values for $a$ and $b$ that minimize the mean-square error
$$\xi=E\left\{(y-ax-b)^2\right\}.$$
The linear mean-square estimation problem may be solved by differentiating $\xi$ with respect to $a$ and $b$ and setting the derivatives equal to zero:
$$\begin{aligned} &\frac{\partial \xi}{\partial a}=-2E\{(y-ax-b)x\}=0 \qquad (*)\\ &\frac{\partial \xi}{\partial b}=-2E\{y-ax-b\}=0 \end{aligned}$$
Before solving these equations for $a$ and $b$, note that $(*)$ says that
$$E\{(y-\hat y)x\}=E\{ex\}=0,$$
where $e=y-\hat y$ is the estimation error. This relationship, known as the orthogonality principle, states that for the optimum linear estimator the estimation error is orthogonal to the data $x$.

Solving these equations for $a$ and $b$, we find
$$\begin{aligned} a&=\frac{E\{xy\}-m_xm_y}{\sigma_x^2}=\frac{c_{xy}}{\sigma_x^2}=\rho_{xy}\frac{\sigma_y}{\sigma_x},\\ b&=\frac{E\{x^2\}m_y-E\{xy\}m_x}{\sigma_x^2}=m_y-am_x. \end{aligned}$$
Therefore, the optimal linear estimator is
$$\hat y=\rho_{xy}\frac{\sigma_y}{\sigma_x}(x-m_x)+m_y,$$
and the minimum mean-square error is easily calculated to be
$$\xi_{\min}=\sigma_y^2(1-\rho_{xy}^2).$$


We see that the correlation coefficient provides a measure of the linear predictability between random variables. The closer $|\rho_{xy}|$ is to $1$, the smaller the mean-square error in the estimation of $y$ using a linear estimator.
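
The closed-form expressions for $a$, $b$, and $\xi_{\min}$ can be checked numerically. The sketch below is illustrative only (the joint distribution and parameters are arbitrary choices, not from the notes); it also verifies the orthogonality principle $E\{ex\}=0$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Jointly distributed (x, y); the linear-plus-noise model is an arbitrary choice.
N = 200_000
x = rng.normal(2.0, 3.0, size=N)
y = -1.5 * x + rng.normal(0.0, 2.0, size=N)

m_x, m_y = x.mean(), y.mean()
s_x, s_y = x.std(), y.std()
rho = np.mean((x - m_x) * (y - m_y)) / (s_x * s_y)

# Optimal linear estimator y_hat = a*x + b from the closed-form solution above.
a = rho * s_y / s_x
b = m_y - a * m_x
y_hat = a * x + b
e = y - y_hat

print(f"a ≈ {a:.3f}, b ≈ {b:.3f}")
# Orthogonality principle: the error is orthogonal to the data x.
print("E{e*x}              ≈", np.mean(e * x))
# Minimum mean-square error matches sigma_y^2 * (1 - rho^2).
print("MSE                 ≈", np.mean(e**2))
print("sigma_y^2*(1-rho^2) ≈", s_y**2 * (1.0 - rho**2))
```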

Random Process

Definition

A random process $x(n)$ is an indexed sequence of random variables (a “signal”).


  • Mean and variance
    $$m_x(n)=E\{x(n)\},\qquad \sigma_x^2(n)=E\{|x(n)-m_x(n)|^2\}$$

  • Autocorrelation and autocovariance
    $$\begin{aligned} r_x(k,l)&=E\{x(k)x^*(l)\}\\ c_x(k,l)&=E\{[x(k)-m_x(k)][x(l)-m_x(l)]^*\}=r_x(k,l)-m_x(k)m_x^*(l) \end{aligned}$$

  • Cross-correlation and cross-covariance
    $$\begin{aligned} r_{xy}(k,l)&=E\{x(k)y^*(l)\}\\ c_{xy}(k,l)&=E\{[x(k)-m_x(k)][y(l)-m_y(l)]^*\}=r_{xy}(k,l)-m_x(k)m_y^*(l) \end{aligned}$$

    • Two random processes $x(n)$ and $y(n)$ are said to be uncorrelated if $c_{xy}(k,l)=0$.
    • Two random processes $x(n)$ and $y(n)$ are said to be orthogonal if $r_{xy}(k,l)=0$.
    • For zero-mean processes: uncorrelated $\iff$ orthogonal.

Property: If two random processes $x(n)$ and $y(n)$ are uncorrelated, then the autocorrelation of the sum
$$z(n)=x(n)+y(n)$$
is equal to the sum of the autocorrelations of $x(n)$ and $y(n)$,
$$r_z(k,l)=r_x(k,l)+r_y(k,l).$$
(When the processes are uncorrelated, the cross-terms $r_{xy}(k,l)$ and $r_{yx}(k,l)$ reduce to products of the means, so strictly they vanish when at least one of the processes is zero mean.)
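
A quick numerical check of this property, assuming two independent zero-mean processes (the specific processes, seed, and lengths below are arbitrary choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_autocorr(x, maxlag):
    """Sample autocorrelation estimate r_x(k) for k = 0..maxlag (real-valued x)."""
    N = len(x)
    return np.array([np.mean(x[k:] * x[:N - k]) for k in range(maxlag + 1)])

# Two independent (hence uncorrelated) zero-mean processes.
N, maxlag = 200_000, 5
x = np.convolve(rng.normal(size=N), [1.0, 0.5], mode="same")  # correlated-in-time process
y = rng.normal(size=N)                                        # white process
z = x + y

print("r_x + r_y :", np.round(sample_autocorr(x, maxlag) + sample_autocorr(y, maxlag), 3))
print("r_z       :", np.round(sample_autocorr(z, maxlag), 3))
```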

Stationarity

  • First-order stationary if
    $$f_{x(n)}(\alpha)=f_{x(n+k)}(\alpha).$$
    This implies $m_x(n)=m_x(0):=m_x$.

  • Second-order stationary if
    $$f_{x(n_1),x(n_2)}(\alpha_1,\alpha_2)=f_{x(n_1+k),x(n_2+k)}(\alpha_1,\alpha_2).$$
    This implies $r_x(k,l)=r_x(k-l,0):=r_x(k-l)$.

  • Stationary in the strict sense if the process is stationary for all orders $L>0$.

  • Wide-sense stationary (WSS) if

    • $m_x(n)=m_x$
    • $r_x(k,l)=r_x(k-l)$
    • $c_x(0)<\infty$

  • Jointly wide-sense stationary if $x(n)$ and $y(n)$ are each wide-sense stationary and the cross-correlation $r_{xy}(k,l)$ depends only on the difference $k-l$:
    $$r_{xy}(k,l)=r_{xy}(k-l,0):=r_{xy}(k-l)$$

  • Properties of WSS processes

    • symmetry: $r_x(k)=r_x^*(-k)$
    • mean-square value: $r_x(0)=E\{|x(n)|^2\}\ge 0$
    • maximum value: $r_x(0)\ge |r_x(k)|$
    • mean-square periodicity: $r_x(k_0)=r_x(0)\iff r_x(k)$ is periodic with period $k_0$

    Proof of $r_x(0)\ge |r_x(k)|$:

    Using the Cauchy–Schwarz inequality,
    $$|\langle x(0),x(k)\rangle|\le\sqrt{\langle x(0),x(0)\rangle\cdot \langle x(k),x(k)\rangle},$$
    i.e.
    $$|r_x(k)|\le\sqrt{r_x(0)\cdot r_x(0)}=r_x(0).$$

Autocorrelation and autocovariance matrices

We consider a WSS process $x(n)$ and collect $p+1$ samples in a vector
$$\mathbf{x}=[x(0),x(1),\cdots,x(p)]^T$$

  • Autocorrelation matrix
    $$\mathbf R_x=E\{\mathbf x \mathbf x^H\}=\begin{bmatrix}r_x(0) & r_x^*(1) & \cdots &r_x^*(p)\\ r_x(1) & r_x(0) & \cdots &r_x^*(p-1)\\ \vdots & \vdots & &\vdots\\ r_x(p) & r_x(p-1) & \cdots &r_x(0) \end{bmatrix}$$
    Properties

    • The autocorrelation matrix of a WSS random process $x(n)$ is a Hermitian Toeplitz matrix, $\mathbf R_x=\mathrm{Toep}\{r_x(0),r_x(1),\cdots,r_x(p)\}$.
    • The autocorrelation matrix of a WSS random process is nonnegative definite, $\mathbf R_x\ge 0$, since $\mathbf a^H\mathbf R_x \mathbf a=E\{|\mathbf a^H \mathbf x|^2\}\ge 0$.
    • The eigenvalues $\lambda_k$ of the autocorrelation matrix of a WSS random process are real-valued and nonnegative.
  • Autocovariance matrix
    $$\mathbf C_x=E\{(\mathbf x-\mathbf m_x)(\mathbf x-\mathbf m_x)^H\}=\mathbf R_x-\mathbf m_x \mathbf m_x^H$$
    where $\mathbf m_x=[m_x,\cdots,m_x]^T$. (A numerical check of these matrices follows this list.)
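
The sketch below is illustrative only (the filtered-noise process and the order $p$ are arbitrary choices): it estimates $r_x(k)$ from data, forms the Hermitian Toeplitz matrix $\mathbf R_x$, and checks that $r_x(0)\ge|r_x(k)|$ and that the eigenvalues are real and nonnegative.

```python
import numpy as np

rng = np.random.default_rng(3)

# A WSS process: white noise through a short FIR filter (arbitrary choice).
N, p = 100_000, 4
v = rng.normal(size=N)
x = np.convolve(v, [1.0, -0.8, 0.3], mode="same")

# Estimate r_x(0..p) and build the (p+1) x (p+1) Toeplitz matrix R_x.
r = np.array([np.mean(x[k:] * x[:N - k]) for k in range(p + 1)])
R = np.empty((p + 1, p + 1))
for i in range(p + 1):
    for j in range(p + 1):
        R[i, j] = r[abs(i - j)]       # real process: r_x(-k) = r_x(k)

eigvals = np.linalg.eigvalsh(R)
print("r_x(0..p)              :", np.round(r, 3))
print("|r_x(k)| <= r_x(0)     :", np.all(np.abs(r) <= r[0] + 1e-9))
print("eigenvalues            :", np.round(eigvals, 3))
print("eigenvalues nonnegative:", np.all(eigvals >= -1e-9))
```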

Ergodicity


When is the sample mean equal to the ensemble mean (expectation)?

  • Sample mean
    $$\hat m_x(N)=\frac{1}{N}\sum_{n=0}^{N-1}x(n)$$

  • A WSS process is ergodic in the mean if (mean-square convergence)
    $$\lim_{N\to \infty}E\{|\hat m_x(N)-m_x|^2\}=0,\quad\text{i.e.}\quad \lim_{N\to \infty}\hat m_x(N)=m_x\ \text{in the mean-square sense,}$$
    which holds if and only if
    $$E\{\hat m_x(N)\}=m_x\ \text{(unbiased)}\quad\text{and}\quad \lim_{N\to \infty}\mathrm{Var}\{\hat m_x(N)\}=0.$$
    From the definition of the sample mean it follows easily that the sample mean is unbiased for any wide-sense stationary process,
    $$E\{\hat m_x(N)\}=\frac{1}{N}\sum_{n=0}^{N-1}E\{x(n)\}=m_x.$$
    In order for the variance to go to zero, however, some constraints must be placed on the process $x(n)$:
    $$\begin{aligned} \mathrm{Var}\{\hat m_x(N)\}&=E\{|\hat m_x(N)-m_x|^2\}=E\left\{\left|\frac{1}{N}\sum_{n=0}^{N-1}[x(n)-m_x]\right|^2\right\}\\ &=\frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}E\{[x(m)-m_x][x(n)-m_x]^*\}\\ &=\frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}c_x(m-n), \end{aligned}$$
    where $c_x(m-n)$ is the autocovariance of $x(n)$. Grouping together common terms, we may write the variance as
    $$\begin{aligned} \mathrm{Var}\{\hat m_x(N)\} &=\frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}c_x(m-n)=\frac{1}{N^2}\sum_{k=-N+1}^{N-1}(N-|k|)c_x(k)\\ &=\frac{1}{N}\sum_{k=-N+1}^{N-1}\left(1-\frac{|k|}{N}\right)c_x(k). \end{aligned}$$
    (A simulation of this behavior is sketched after the theorems below.)

    • Mean Ergodic Theorem 1

      Let $x(n)$ be a WSS random process with autocovariance sequence $c_x(k)$. A necessary and sufficient condition for $x(n)$ to be ergodic in the mean is
      $$\lim_{N\to\infty}\frac{1}{N}\sum_{k=-N+1}^{N-1}c_x(k)=0.$$

    • Mean Ergodic Theorem 2

      Let $x(n)$ be a WSS random process with autocovariance sequence $c_x(k)$. Sufficient conditions for $x(n)$ to be ergodic in the mean are that $c_x(0)<\infty$ and
      $$\lim_{k\to\infty}c_x(k)=0.$$
      In other words, a WSS process will be ergodic in the mean if it is asymptotically uncorrelated.
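
A minimal simulation of ergodicity in the mean (assumptions: an AR(1)-type process whose autocovariance decays geometrically to zero; the parameters $a$, $m$, and the record sizes are arbitrary choices, not from the notes). The expectation of the sample mean stays at $m_x$ while its variance shrinks as $N$ grows.

```python
import numpy as np

rng = np.random.default_rng(4)

# x(n) = m + s(n), with s(n) = a*s(n-1) + v(n) and white v(n).
# Its autocovariance c_x(k) decays like a^|k| -> 0, so the process is ergodic in the mean.
a, m, n_real, N = 0.9, 1.0, 500, 5000
v = rng.normal(size=(n_real, N))
s = np.zeros((n_real, N))
for n in range(1, N):
    s[:, n] = a * s[:, n - 1] + v[:, n]
x = m + s

for Ns in (100, 1000, 5000):
    sample_means = x[:, :Ns].mean(axis=1)   # one sample mean per realization
    print(f"N={Ns:5d}:  E{{m_hat}} ≈ {sample_means.mean():.3f},  "
          f"Var{{m_hat}} ≈ {sample_means.var():.4f}")
```

Across the realizations, the printed variance drops roughly in proportion to $1/N$, as the formula for $\mathrm{Var}\{\hat m_x(N)\}$ predicts for a summable autocovariance.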

White noise

  • White noise is a discrete-time random process $v(n)$ with autocovariance
    $$c_v(k)=\sigma_v^2\delta(k),$$
    i.e., a sequence of uncorrelated random variables, each having variance $\sigma_v^2$.

  • White noise is defined only in terms of the form of its second-order moments $\to$ there is an infinite variety of white noise random processes: white Gaussian noise, the Bernoulli process, …

  • The power spectrum (defined later) of zero-mean white noise is constant:
    $$P_v(e^{j\omega})=\sum_{k=-\infty}^\infty r_v(k)e^{-jk\omega}=\sigma_v^2$$
    (see the numerical sketch after this list).

  • For complex white noise $v(n)=v_1(n)+jv_2(n)$,
    $$E\{|v(n)|^2\}=E\{|v_1(n)|^2\}+E\{|v_2(n)|^2\},$$
    i.e., the variance of $v(n)$ is the sum of the variances of the real and imaginary components, $v_1(n)$ and $v_2(n)$.
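
A short NumPy check (illustrative only; $\sigma_v$, the record length, and the segmentation are arbitrary choices) that the sample autocovariance of zero-mean white Gaussian noise is close to $\sigma_v^2\delta(k)$ and that its averaged periodogram is roughly flat at $\sigma_v^2$.

```python
import numpy as np

rng = np.random.default_rng(5)

sigma_v, N = 1.5, 200_000
v = sigma_v * rng.normal(size=N)            # zero-mean white Gaussian noise

# Autocovariance estimate: should be close to sigma_v^2 * delta(k).
c = np.array([np.mean(v[k:] * v[:N - k]) for k in range(5)])
print("c_v(0..4):", np.round(c, 3), " (sigma_v^2 =", sigma_v**2, ")")

# Averaged periodogram: roughly flat at sigma_v^2 across frequency.
nseg, L = 400, 500
segs = v[:nseg * L].reshape(nseg, L)
mean_periodogram = np.mean(np.abs(np.fft.rfft(segs, axis=1))**2 / L, axis=0)
print("mean periodogram (min, max):",
      np.round(mean_periodogram.min(), 2), np.round(mean_periodogram.max(), 2))
```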

Power spectrum

  • The power spectrum of a WSS process is the DTFT of the autocorrelation:
    $$P_x(e^{j\omega})=\sum_{k=-\infty}^{\infty}r_x(k)e^{-jk\omega}.$$
    The autocorrelation sequence may be recovered by taking the inverse discrete-time Fourier transform of $P_x(e^{j\omega})$,
    $$r_x(k)=\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})e^{jk\omega}\,d\omega.$$
    In some cases it may be more convenient to use the z-transform instead of the discrete-time Fourier transform:
    $$P_x(z)=\sum_{k=-\infty}^{\infty}r_x(k)z^{-k}.$$

  • Since the autocorrelation is conjugate symmetric, the power spectrum is real:
    $$P_x(z)=P_x^*(1/z^*) \quad \Rightarrow \quad P_x(e^{j\omega})=P_x^*(e^{j\omega})$$

  • If the stochastic process is real, the power spectrum is even:
    $$P_x(z)=P_x^*(z^*) \quad \Rightarrow \quad P_x(e^{j\omega})=P_x^*(e^{-j\omega})=P_x(e^{-j\omega})$$

  • The power spectrum is nonnegative:
    $$P_x(e^{j\omega})\ge 0$$

  • The total power is proportional to the area under the power spectrum:
    $$E\{|x(n)|^2\}=r_x(0)=\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})\,d\omega$$

  • The eigenvalues $\lambda_i$ of the $n\times n$ autocorrelation matrix are upper and lower bounded by the maximum and minimum values, respectively, of the power spectrum:
    $$\min_\omega P_x(e^{j\omega})\le \lambda_i\le\max_\omega P_x(e^{j\omega})$$
    Proof: Let $\lambda_i$ and $\mathbf q_i$ be the eigenvalues and eigenvectors, respectively, of the $n\times n$ autocorrelation matrix $\mathbf R_x$,
    $$\mathbf R_x \mathbf q_i=\lambda_i \mathbf q_i;\quad i=1,2,\cdots,n.$$
    Since $\mathbf q_i^H\mathbf R_x \mathbf q_i=\lambda_i \mathbf q_i^H\mathbf q_i$,
    $$\lambda_i=\frac{\mathbf q_i^H\mathbf R_x \mathbf q_i}{\mathbf q_i^H\mathbf q_i}.$$
    Expanding the Hermitian form in the numerator, we have
    $$\begin{aligned} \mathbf q_i^H\mathbf R_x \mathbf q_i&=\sum_{k=0}^{n-1}\sum_{l=0}^{n-1}q_i^*(k)r_x(k-l)q_i(l)\\ &=\frac{1}{2\pi}\sum_{k=0}^{n-1}\sum_{l=0}^{n-1}q_i^*(k)q_i(l)\int_{-\pi}^\pi P_x(e^{j\omega})e^{j\omega(k-l)}\,d\omega\\ &=\frac{1}{2\pi}\int_{-\pi}^\pi\left[\sum_{k=0}^{n-1}q_i^*(k)e^{j\omega k}\right]\left[\sum_{l=0}^{n-1}q_i(l) e^{-j\omega l}\right]P_x(e^{j\omega})\,d\omega. \end{aligned}$$
    With $Q_i(e^{j\omega})=\sum_{k=0}^{n-1}q_i(k)e^{-jk\omega}$,
    $$\mathbf q_i^H\mathbf R_x \mathbf q_i=\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})|Q_i(e^{j\omega})|^2\,d\omega.$$
    Repeating these steps we find
    $$\mathbf q_i^H \mathbf q_i=\frac{1}{2\pi}\int_{-\pi}^\pi |Q_i(e^{j\omega})|^2\,d\omega.$$
    Therefore,
    $$\min_\omega P_x(e^{j\omega})\le \lambda_i=\frac{\mathbf q_i^H\mathbf R_x \mathbf q_i}{\mathbf q_i^H\mathbf q_i}=\frac{\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})|Q_i(e^{j\omega})|^2\,d\omega}{\frac{1}{2\pi}\int_{-\pi}^\pi |Q_i(e^{j\omega})|^2\,d\omega}\le\max_\omega P_x(e^{j\omega}).$$

  • The power spectrum is related to the mean of $|X(e^{j\omega})|^2$ as
    $$P_x(e^{j\omega})=\lim_{N\to\infty}\frac{1}{2N+1}E\left\{\left|\sum_{n=-N}^N x(n)e^{-j\omega n}\right|^2\right\}.$$
    In other words, the power spectrum is the limit of the ensemble average (expectation) of the normalized squared Fourier magnitude of a finite data record (a numerical illustration is sketched after this list).

    Proof: Denote the normalized squared Fourier magnitude of the $2N+1$ samples by
    $$P_N(e^{j\omega})\triangleq \frac{1}{2N+1}\left|\sum_{n=-N}^N x(n)e^{-j\omega n}\right|^2,$$
    so the claim becomes
    $$P_x(e^{j\omega})=\lim_{N\to \infty}E\{P_N(e^{j\omega})\}.$$
    Since
    $$\begin{aligned} E\{P_N(e^{j\omega})\}&=\frac{1}{2N+1}E\left\{\left(\sum_{n=-N}^N x(n)e^{-j\omega n}\right)\left(\sum_{m=-N}^N x(m)e^{-j\omega m}\right)^*\right\}\\ &=\frac{1}{2N+1}E\left\{\sum_{n=-N}^N \sum_{m=-N}^N x(n)x^*(m)e^{-j\omega (n-m)}\right\}\\ &=\frac{1}{2N+1}\sum_{n=-N}^N \sum_{m=-N}^N r_x(n-m)e^{-j\omega (n-m)}\\ &=\frac{1}{2N+1}\sum_{k=-2N}^{2N}(2N+1-|k|)r_x(k)e^{-jk\omega}, \end{aligned}$$
    and assuming that the autocorrelation sequence decays to zero fast enough that $\sum_{k=-\infty}^\infty|k|\,|r_x(k)|<\infty$, letting $N\to \infty$ it follows that
    $$\lim_{N\to \infty}E\{P_N(e^{j\omega})\}=\sum_{k=-\infty}^\infty r_x(k)e^{-jk\omega}=P_x(e^{j\omega}).$$

  • If $x(n)$ has a nonzero mean or a periodicity, the power spectrum contains impulses.
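
The sketch below illustrates the limit $P_x(e^{j\omega})=\lim_{N\to\infty}E\{P_N(e^{j\omega})\}$ for an MA(1) process, whose power spectrum is known in closed form, and also checks the eigenvalue bounds stated above. It is illustrative only; the process, record length, and matrix size are arbitrary choices, not from the notes.

```python
import numpy as np

rng = np.random.default_rng(6)

# MA(1) process x(n) = v(n) + b*v(n-1), v(n) white with unit variance.
# Then r_x(0) = 1 + b^2, r_x(±1) = b, and P_x(e^{jw}) = 1 + b^2 + 2*b*cos(w).
b, L, nseg = 0.7, 256, 2000
v = rng.normal(size=nseg * L + 1)
x = v[1:] + b * v[:-1]
segs = x.reshape(nseg, L)

# Average P_N over many data records (segments of one long realization)
# and compare with the analytic power spectrum at the FFT bin frequencies.
w = 2 * np.pi * np.arange(L // 2 + 1) / L
P_N_mean = np.mean(np.abs(np.fft.rfft(segs, axis=1))**2 / L, axis=0)
P_true = 1 + b**2 + 2 * b * np.cos(w)
print("max |mean P_N - P_x| :", np.round(np.max(np.abs(P_N_mean - P_true)), 3))

# Eigenvalues of the autocorrelation matrix lie between min and max of P_x.
p = 8
r = np.array([1 + b**2, b] + [0.0] * (p - 1))        # exact r_x(k) for the MA(1) model
R = np.array([[r[abs(i - j)] for j in range(p + 1)] for i in range(p + 1)])
lam = np.linalg.eigvalsh(R)
print("min P_x =", round((1 - b)**2, 3), " max P_x =", round((1 + b)**2, 3))
print("eigenvalues:", np.round(lam, 3))
```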

Filtering Random Processes

  • The relationship between the mean and autocorrelation of the input process and those of the output process:

    Suppose $x(n)$ is a WSS process with mean $m_x$ and autocorrelation $r_x(k)$ that is filtered by a stable LSI filter with unit sample response $h(n)$; then the output $y(n)$ is also WSS with
    $$\begin{aligned} m_y&=m_xH(e^{j0})\\ r_y(k)&=r_x(k)*h(k)*h^*(-k)\\ &\triangleq r_{yx}(k)*h^*(-k)\\ &\triangleq r_x(k)*r_h(k) \end{aligned}$$
    (a numerical check is sketched at the end of this section).


Proof: The output $y(n)$ is related to $x(n)$ by the convolution sum
$$y(n)=x(n)*h(n)=\sum_{k=-\infty}^\infty h(k)x(n-k).$$
The mean of $y(n)$ may be found by taking the expected value of both sides,
$$\begin{aligned} E\{y(n)\}&=E\left\{\sum_{k=-\infty}^\infty h(k)x(n-k)\right\}=\sum_{k=-\infty}^\infty h(k)E\{x(n-k)\}\\ &=m_x\sum_{k=-\infty}^\infty h(k)=m_xH(e^{j0}). \end{aligned}$$
Before computing the autocorrelation of $y(n)$, we first compute the cross-correlation between $y(n)$ and $x(n)$ (the autocorrelation can also be computed directly):
$$\begin{aligned} r_{yx}(n+k,n)&=E\{y(n+k)x^*(n)\}=E\left\{\sum_{l=-\infty}^\infty h(l)x(n+k-l)x^*(n)\right\}\\ &=\sum_{l=-\infty}^\infty h(l)E\{x(n+k-l)x^*(n)\}\\ &=\sum_{l=-\infty}^\infty h(l) r_x(k-l)=r_x(k)*h(k). \end{aligned}$$
The autocorrelation of $y(n)$ may now be determined as follows,
$$\begin{aligned} r_{y}(n+k,n)&=E\{y(n+k)y^*(n)\}=E\left\{y(n+k)\sum_{l=-\infty}^\infty x^*(l)h^*(n-l)\right\}\\ &=\sum_{l=-\infty}^\infty h^*(n-l)E\{y(n+k)x^*(l)\}\\ &=\sum_{l=-\infty}^\infty h^*(n-l) r_{yx}(n+k-l)=r_{yx}(k)*h^*(-k). \end{aligned}$$
Therefore, the autocorrelation sequence $r_y(n+k,n)$ depends only on $k$, the difference between the indices $n+k$ and $n$, i.e.,
$$r_y(k)=r_{yx}(k)*h^*(-k)=r_x(k)*h(k)*h^*(-k)=\sum_{l=-\infty}^\infty \sum_{m=-\infty}^\infty h(l)r_x(m-l+k)h^*(m).$$
Another interpretation can be obtained by defining
$$r_h(k)=h(k)*h^*(-k)=\sum_{n=-\infty}^{\infty}h(n+k)h^*(n).$$
Then
$$r_y(k)=r_x(k)*r_h(k).$$

  • The power of $y(n)$ is given by
    $$E\{|y(n)|^2\}=r_y(0)=\sum_{l=-\infty}^\infty \sum_{m=-\infty}^\infty h(l)r_x(m-l)h^*(m)=\mathbf h^H \mathbf R_x \mathbf h,$$
    where we assume $h(n)$ is zero outside $[0,N-1]$ and $\mathbf h=[h(0),h(1),\cdots,h(N-1)]^T$.

  • In terms of the power spectrum,
    $$\begin{aligned} P_y(e^{j\omega})&=P_x(e^{j\omega})|H(e^{j\omega})|^2\\ P_y(z)&=P_x(z)H(z)H^*(1/z^*) \end{aligned}$$
    So, assuming no pole/zero cancellations between $P_x(z)$ and $H(z)$, if $H(z)$ has a pole (zero) at $z=z_0$, then $P_y(z)$ also has a pole (zero) at $z=z_0$ and another at the conjugate reciprocal location $z=1/z_0^*$.

  • If $H(e^{j\omega})$ is a narrow-band bandpass filter with center frequency $\omega_0$, bandwidth $\Delta \omega$, and magnitude $1$, then the output power is
    $$E\{|y(n)|^2\}=r_y(0)=\frac{1}{2\pi}\int_{-\pi}^\pi |H(e^{j\omega})|^2P_x(e^{j\omega})\,d\omega \approx \frac{\Delta \omega}{2\pi}P_x(e^{j\omega_0}).$$
    Since $E\{|y(n)|^2\}$ represents the power of $x(n)$ within the band $\Delta \omega$, the power spectrum $P_x(e^{j\omega})$ may be viewed as a density function that describes how the power in $x(n)$ varies with $\omega$, i.e., how the power is distributed over frequency.
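
As a closing numerical check (illustrative only; the filter $h$, the white input, and the record lengths are arbitrary choices, not from the notes), the sketch below verifies $r_y(k)=r_x(k)*r_h(k)$, the output power $r_y(0)=\mathbf h^H\mathbf R_x\mathbf h$, and $P_y(e^{j\omega})=\sigma_v^2|H(e^{j\omega})|^2$ for a white-noise input.

```python
import numpy as np

rng = np.random.default_rng(7)

# White noise with variance sigma2 through a short FIR filter h.
sigma2, N = 2.0, 400_000
h = np.array([1.0, -0.5, 0.25])
x = np.sqrt(sigma2) * rng.normal(size=N)
y = np.convolve(x, h, mode="same")

# r_y(k) should equal r_x(k) * r_h(k) = sigma2 * r_h(k) for white input.
r_h = np.correlate(h, h, mode="full")            # deterministic autocorrelation of h
maxlag = len(h) - 1
r_y_est = np.array([np.mean(y[k:] * y[:N - k]) for k in range(maxlag + 1)])
print("sigma2 * r_h(k), k>=0 :", sigma2 * r_h[maxlag:])
print("estimated r_y(k)      :", np.round(r_y_est, 3))

# Output power two ways: r_y(0) and h^H R_x h (R_x = sigma2 * I for white input).
R_x = sigma2 * np.eye(len(h))
print("h^H R_x h =", h @ R_x @ h, "  vs  r_y(0) ≈", round(r_y_est[0], 3))

# Power spectrum check: P_y(e^{jw}) = sigma2 * |H(e^{jw})|^2 for white input,
# compared against an averaged periodogram of the output.
L, nseg = 512, 700
segs = y[: L * nseg].reshape(nseg, L)
P_y_est = np.mean(np.abs(np.fft.rfft(segs, axis=1))**2 / L, axis=0)
w = 2 * np.pi * np.arange(L // 2 + 1) / L
H = np.array([np.sum(h * np.exp(-1j * wk * np.arange(len(h)))) for wk in w])
print("max |P_y_est - sigma2*|H|^2| :",
      np.round(np.max(np.abs(P_y_est - sigma2 * np.abs(H)**2)), 3))
```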