Discrete-Time Random Processes
Reference:
Slides of EE4C03, TUD
Hayes, M. H., Statistical Digital Signal Processing and Modeling, John Wiley & Sons, 2009.
Content
Random Variables
Definitions
A random variable $x$ is a function that assigns a number to each outcome of a random experiment.

- Probability distribution function
  $$F_x(\alpha)=\Pr(x\le \alpha)$$
- Probability density function
  $$f_x(\alpha)=\frac{d}{d\alpha}F_x(\alpha)$$
- Mean or expected value
  $$m_x=E\{x\}=\int_{-\infty}^\infty \alpha f_x(\alpha)\,d\alpha$$
- Variance
  $$\sigma_x^2=E\{(x-m_x)^2\}=\int_{-\infty}^\infty(\alpha-m_x)^2f_x(\alpha)\,d\alpha=E\{x^2\}-m_x^2$$
- Joint probability distribution function
  $$F_{x,y}(\alpha,\beta)=\Pr\{x\le \alpha,\,y\le \beta\}$$
- Joint probability density function
  $$f_{x,y}(\alpha,\beta)=\frac{\partial ^2}{\partial \alpha\,\partial \beta}F_{x,y}(\alpha,\beta)$$
- Correlation
  $$r_{xy}=E\{xy^*\}$$
- Covariance
  $$c_{xy}=\mathrm{Cov}(x,y)=E\{(x-m_x)(y-m_y)^*\}=r_{xy}-m_xm_y^*$$
- Correlation coefficient
  $$\rho_{xy}=\frac{c_{xy}}{\sigma_x\sigma_y}=\frac{r_{xy}-m_xm_y^*}{\sigma_x\sigma_y},\qquad |\rho_{xy}|\le 1$$
Proof: Define an inner product on the set of random variables
$$\langle x,y\rangle:=E\{xy^*\}$$
From the Cauchy–Schwarz inequality we have
$$|\langle x-m_x,y-m_y\rangle|^2\le\langle x-m_x,x-m_x\rangle\cdot\langle y-m_y,y-m_y\rangle,$$
i.e. $|c_{xy}|\le \sigma_x \sigma_y$. The coefficient $\rho_{xy}$ plays the same role here as $\cos \theta$ does in the inner product of two vectors.
- Two random variables $x$ and $y$ are independent if
  $$f_{x,y}(\alpha,\beta)=f_x(\alpha)f_y(\beta)$$
- Two random variables $x$ and $y$ are uncorrelated if
  $$E\{xy^*\}=E\{x\}E\{y^*\},\quad\text{i.e. } r_{xy}=m_xm_y^*\ \text{ or }\ c_{xy}=0$$
- Two random variables $x$ and $y$ are orthogonal if
  $$E\{xy^*\}=0,\quad\text{i.e. } r_{xy}=0$$
Orthogonal random variables are not necessarily uncorrelated. However, orthogonal $\iff$ uncorrelated if $m_x=m_y=0$.
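As a quick numerical illustration (a minimal NumPy sketch; the distributions and variable names are arbitrary choices, not from the source), the snippet below estimates $r_{xy}$, $c_{xy}$ and $\rho_{xy}$ from samples and checks the identity $c_{xy}=r_{xy}-m_xm_y$ and the bound $|\rho_{xy}|\le 1$ for real-valued data.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Two correlated real random variables: y depends linearly on x plus independent noise.
x = rng.normal(loc=1.0, scale=1.5, size=N)
y = 0.8 * x + rng.normal(loc=-0.5, scale=2.0, size=N)

r_xy = np.mean(x * y)                                # correlation  E{x y}
c_xy = np.mean((x - x.mean()) * (y - y.mean()))      # covariance
rho_xy = c_xy / (x.std() * y.std())                  # correlation coefficient

print(c_xy, r_xy - x.mean() * y.mean())              # equal: c_xy = r_xy - m_x m_y
print(rho_xy, abs(rho_xy) <= 1)                      # |rho_xy| <= 1
```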
Linear Mean-Square Estimation
In mean-square estimation, an estimate $\hat y$ is to be found that minimizes the mean-square error
$$\xi=E\left\{(y-\hat y)^2\right\}$$
Although the solution to this problem generally leads to a nonlinear estimator, in many cases a linear estimator is preferred. In linear mean-square estimation, the estimator is constrained to be of the form
$$\hat y=ax+b$$
and the goal is to find the values for $a$ and $b$ that minimize the mean-square error
$$\xi=E\left\{(y-ax-b)^2\right\}$$
Solving the linear mean-square estimation problem may be accomplished by differentiating $\xi$ with respect to $a$ and $b$ and setting the derivatives equal to zero as follows,
$$\begin{aligned} &\frac{\partial \xi}{\partial a}=-2E\{(y-ax-b)x\}=0\tag{*}\\ &\frac{\partial \xi}{\partial b}=-2E\{y-ax-b\}=0 \end{aligned}$$
Before solving these equations for $a$ and $b$, note that (*) says that
$$E\{(y-\hat y)x\}=E\{ex\}=0$$
where $e=y-\hat y$ is the estimation error. This relationship, known as the orthogonality principle, states that for the optimum linear predictor the estimation error will be orthogonal to the data $x$.
Solving the equations above for $a$ and $b$ we find
$$\begin{aligned} a&=\frac{E\{xy\}-m_xm_y}{\sigma_x^2}=\frac{c_{xy}}{\sigma_x^2}=\rho_{xy} \frac{\sigma _y}{\sigma_x},\\ b&=\frac{E\{x^2\}m_y-E\{xy\}m_x}{\sigma_x^2}=m_y-am_x. \end{aligned}$$
Therefore, the estimator is
$$\hat y=\rho_{xy}\frac{\sigma _y}{\sigma_x}(x-m_x)+m_y,$$
and the minimum mean-square error is easily calculated to be
$$\xi_{\min}=\sigma_y^2(1-\rho_{xy}^2)$$
We see that the correlation coefficient provides a measure of the linear predictability between random variables. The closer $|\rho_{xy}|$ is to $1$, the smaller the mean-square error in the estimation of $y$ using a linear estimator.
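The closed-form result above is easy to verify numerically. The sketch below (NumPy; the particular joint distribution of $x$ and $y$ is only an illustrative assumption) computes $a$ and $b$ from sample statistics and compares the empirical mean-square error of $\hat y = ax+b$ with $\sigma_y^2(1-\rho_{xy}^2)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generate correlated (x, y) pairs: y = 2x + noise, so y is partly linearly predictable from x.
N = 100_000
x = rng.normal(loc=1.0, scale=2.0, size=N)
y = 2.0 * x + rng.normal(scale=3.0, size=N)

# Sample statistics
m_x, m_y = x.mean(), y.mean()
var_x, var_y = x.var(), y.var()
c_xy = np.mean((x - m_x) * (y - m_y))
rho = c_xy / np.sqrt(var_x * var_y)

# Closed-form linear MMSE coefficients: a = c_xy / sigma_x^2, b = m_y - a m_x
a = c_xy / var_x
b = m_y - a * m_x
y_hat = a * x + b

mse_empirical = np.mean((y - y_hat) ** 2)
mse_theory = var_y * (1 - rho ** 2)
print(mse_empirical, mse_theory)   # the two values should agree closely
```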
Random Process
Definition
A random process $x(n)$ is an indexed sequence of random variables (a "signal").

- Mean and variance
  $$m_x(n)=E\{x(n)\}\qquad \sigma_x^2(n)=E\{|x(n)-m_x(n)|^2\}$$
- Autocorrelation and autocovariance
  $$\begin{aligned} &r_x(k,l)=E\{x(k)x^*(l)\}\\ &c_x(k,l)=E\{[x(k)-m_x(k)][x(l)-m_x(l)]^*\}=r_x(k,l)-m_x(k)m_x^*(l) \end{aligned}$$
- Cross-correlation and cross-covariance
  $$\begin{aligned} &r_{xy}(k,l)=E\{x(k)y^*(l)\}\\ &c_{xy}(k,l)=E\{[x(k)-m_x(k)][y(l)-m_y(l)]^*\}=r_{xy}(k,l)-m_x(k)m_y^*(l) \end{aligned}$$
  - Two random processes $x(n)$ and $y(n)$ are said to be uncorrelated if $c_{xy}(k,l)=0$.
  - Two random processes $x(n)$ and $y(n)$ are said to be orthogonal if $r_{xy}(k,l)=0$.
  - For zero-mean processes: uncorrelated $\iff$ orthogonal.
Property: If two random processes $x(n)$ and $y(n)$ are uncorrelated, then the autocorrelation of the sum $z(n)=x(n)+y(n)$ is equal to the sum of the autocorrelations of $x(n)$ and $y(n)$,
$$r_z(k,l)=r_x(k,l)+r_y(k,l)$$
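A quick numerical check of this property (a NumPy sketch under the assumption that the two processes are independent, zero-mean sequences, which makes them uncorrelated): the estimated autocorrelation of $z(n)=x(n)+y(n)$ should match the sum of the individual autocorrelations up to estimation error.

```python
import numpy as np

rng = np.random.default_rng(1)
N, max_lag = 200_000, 3

# Two independent zero-mean processes: white noise and a short moving average of white noise.
x = rng.normal(size=N)
y = np.convolve(rng.normal(size=N), [1.0, 0.7], mode="same")
z = x + y

def autocorr(s, max_lag):
    """Biased sample autocorrelation r_s(k) for k = 0..max_lag (real-valued data)."""
    return np.array([np.dot(s[: len(s) - k], s[k:]) / len(s) for k in range(max_lag + 1)])

print(autocorr(z, max_lag))                             # r_z(k)
print(autocorr(x, max_lag) + autocorr(y, max_lag))      # r_x(k) + r_y(k), nearly identical
```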
Stationarity
- First-order stationary if
  $$f_{x(n)}(\alpha)=f_{x(n+k)}(\alpha).$$
  Implies $m_x(n)=m_x(0):=m_x$.
- Second-order stationary if
  $$f_{x(n_1),x(n_2)}(\alpha_1,\alpha_2)=f_{x(n_1+k),x(n_2+k)}(\alpha_1,\alpha_2).$$
  Implies $r_x(k,l)=r_x(k-l,0):=r_x(k-l)$.
- Stationary in the strict sense if the process is stationary for all orders $L>0$.
- Wide-sense stationary (WSS) if
  - $m_x(n)=m_x$
  - $r_x(k,l)=r_x(k-l)$
  - $c_x(0)<\infty$
- Jointly wide-sense stationary if $x(n)$ and $y(n)$ are wide-sense stationary and the cross-correlation $r_{xy}(k,l)$ depends only on the difference $k-l$:
  $$r_{xy}(k,l)=r_{xy}(k-l,0):=r_{xy}(k-l)$$
- Properties of WSS processes
  - symmetry: $r_x(k)=r_x^*(-k)$
  - mean-square value: $r_x(0)=E\{|x(n)|^2\}\ge 0$
  - maximum value: $r_x(0)\ge |r_x(k)|$
  - mean-square periodicity: $r_x(k_0)=r_x(0)\iff r_x(k)$ is periodic with period $k_0$

Proof of $r_x(0)\ge |r_x(k)|$: using the Cauchy–Schwarz inequality,
$$|\langle x(0),x(k)\rangle|\le\sqrt{\langle x(0),x(0)\rangle\cdot \langle x(k),x(k)\rangle},$$
i.e.
$$|r_x(k)|\le\sqrt{r_x(0)\cdot r_x(0)}=r_x(0)$$
Autocorrelation and autocovariance matrices
We consider a WSS process $x(n)$ and collect $p+1$ samples in a vector
$$\mathbf{x}=[x(0),x(1),\cdots,x(p)]^T$$
- Autocorrelation matrix
  $$\mathbf R_x=E\{\mathbf x \mathbf x^H\}=\left[\begin{matrix}r_x(0) & r_x^*(1) & \cdots &r_x^*(p)\\ r_x(1) & r_x(0) & \cdots &r_x^*(p-1)\\ \vdots & \vdots & &\vdots\\ r_x(p) & r_x(p-1) & \cdots &r_x(0) \end{matrix}\right]$$
  Properties:
  - The autocorrelation matrix of a WSS random process $x(n)$ is a Hermitian Toeplitz matrix, $\mathbf R_x=\mathrm{Toep}\{r_x(0),r_x(1),\cdots,r_x(p)\}$.
  - The autocorrelation matrix of a WSS random process is nonnegative definite, $\mathbf R_x\ge 0$ (since $\mathbf a^H\mathbf R_x \mathbf a=E\{|\mathbf a^H \mathbf x|^2\}\ge 0$ for any $\mathbf a$).
  - The eigenvalues, $\lambda_k$, of the autocorrelation matrix of a WSS random process are real-valued and nonnegative.
- Autocovariance matrix
  $$\mathbf C_x=E\{(\mathbf x-\mathbf m_x)(\mathbf x-\mathbf m_x)^H\}=\mathbf R_x-\mathbf m_x \mathbf m_x^H$$
  where $\mathbf m_x=[m_x,\cdots,m_x]^T$.
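The sketch below (NumPy/SciPy; the filter coefficients and sizes are illustrative assumptions) estimates $r_x(k)$ for a simple WSS process, builds the Hermitian Toeplitz matrix $\mathbf R_x$ with `scipy.linalg.toeplitz`, and checks that its eigenvalues are real and nonnegative.

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(1)

# A simple WSS process: white noise passed through a short FIR filter.
N, p = 50_000, 3
v = rng.normal(size=N)
x = np.convolve(v, [1.0, 0.5, 0.25], mode="same")

# Biased sample autocorrelation r_x(k) for k = 0..p
r = np.array([np.dot(x[: N - k], x[k:]) / N for k in range(p + 1)])

# Hermitian Toeplitz autocorrelation matrix R_x = Toep{r_x(0), ..., r_x(p)}
R = toeplitz(r)          # real data here, so conjugation is a no-op

eigvals = np.linalg.eigvalsh(R)
print(R)
print(eigvals)           # real and nonnegative
```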
Ergodicity
When is the sample mean equal to the ensemble mean (expectation)?
- Sample mean
  $$\hat m_x(N)=\frac{1}{N}\sum_{n=0}^{N-1}x(n)$$
- A WSS process is ergodic in the mean if (mean-square convergence)
  $$\lim_{N\to \infty}E\{|\hat m_x(N)-m_x|^2\}=0,\quad\text{i.e. }\lim_{N\to \infty}\hat m_x(N)=m_x\ \text{(in the mean-square sense)}$$
  $$\iff\quad E\{\hat m_x(N)\}=m_x\ \text{(unbiased)}\quad\text{and}\quad\lim_{N\to \infty}\mathrm{Var}\{\hat m_x(N)\}=0$$
From the definition of the sample mean it follows easily that the sample mean is unbiased for any wide-sense stationary process,
$$E\{\hat m_x(N)\}=\frac{1}{N}\sum_{n=0}^{N-1}E\{x(n)\}=m_x$$
In order for the variance to go to zero, however, some constraints must be placed on the process $x(n)$.
$$\begin{aligned} \mathrm{Var}\{\hat m_x(N)\}&=E\{|\hat m_x(N)-m_x|^2\}=E\left\{\left|\frac{1}{N}\sum_{n=0}^{N-1}[x(n)-m_x]\right|^2\right\}\\ &=\frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}E\{[x(m)-m_x][x(n)-m_x]^*\}\\ &=\frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}c_x(m-n) \end{aligned}$$
where $c_x(m-n)$ is the autocovariance of $x(n)$. Grouping together common terms we may write the variance as
$$\begin{aligned} \mathrm{Var}\{\hat m_x(N)\} &=\frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}c_x(m-n)=\frac{1}{N^2}\sum_{k=-N+1}^{N-1}(N-|k|)c_x(k)\\ &=\frac{1}{N}\sum_{k=-N+1}^{N-1}\left(1-\frac{|k|}{N}\right)c_x(k) \end{aligned}$$
- Mean Ergodic Theorem 1
  Let $x(n)$ be a WSS random process with autocovariance sequence $c_x(k)$. A necessary and sufficient condition for $x(n)$ to be ergodic in the mean is
  $$\lim_{N\to\infty}\frac{1}{N}\sum_{k=-N+1}^{N-1}c_x(k)=0$$
- Mean Ergodic Theorem 2
  Let $x(n)$ be a WSS random process with autocovariance sequence $c_x(k)$. Sufficient conditions for $x(n)$ to be ergodic in the mean are that $c_x(0)<\infty$ and
  $$\lim _{k\to\infty}c_x(k)=0$$
In other words, a WSS process will be ergodic in the mean if it is asymptotically uncorrelated.
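As a numerical illustration (a NumPy/SciPy sketch; the AR(1) model and its parameters are assumptions made for this example), the code below shows the variance of the sample mean shrinking as $N$ grows for a process whose autocovariance decays to zero, consistent with Mean Ergodic Theorem 2.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(2)

def sample_mean_variance(num_trials: int, N: int, a: float = 0.9) -> float:
    """Empirical variance of the sample mean of an AR(1) process x(n) = a*x(n-1) + v(n)."""
    means = np.empty(num_trials)
    for t in range(num_trials):
        v = rng.normal(size=N)
        x = lfilter([1.0], [1.0, -a], v)   # AR(1): c_x(k) decays geometrically to zero
        means[t] = x.mean()
    return means.var()

# The variance of the sample mean shrinks toward zero as N grows (ergodic in the mean).
for N in (100, 1_000, 10_000):
    print(N, sample_mean_variance(200, N))
```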
White noise

- White noise is a discrete-time random process $v(n)$ with autocovariance
  $$c_v(k)=\sigma_v^2\delta(k),$$
  i.e., a sequence of uncorrelated random variables, each having a variance of $\sigma_v^2$.
- It is defined only in terms of the form of its second-order moment $\to$ there is an infinite variety of white noise random processes: white Gaussian noise, the Bernoulli process, ...
- The power spectrum (defined later) of zero-mean white noise is constant:
  $$P_v(e^{j\omega})=\sum_{k=-\infty}^\infty r_v(k)e^{-jk\omega}=\sigma_v^2$$
- For complex white noise $v(n)=v_1(n)+jv_2(n)$,
  $$E\{|v(n)|^2\}=E\{|v_1(n)|^2\}+E\{|v_2(n)|^2\},$$
  i.e., the variance of $v(n)$ is the sum of the variances of the real and imaginary components, $v_1(n)$ and $v_2(n)$, respectively.
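The point that white noise is only a second-order description is easy to see numerically: the sketch below (NumPy; amplitudes chosen arbitrarily) estimates the autocovariance of a white Gaussian process and a Bernoulli $\pm 2$ process, both of which have $c_v(k)\approx\sigma_v^2\delta(k)$ with $\sigma_v^2=4$ despite very different distributions.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 200_000

gaussian = rng.normal(scale=2.0, size=N)              # white Gaussian noise, variance 4
bernoulli = np.where(rng.random(N) < 0.5, 2.0, -2.0)  # Bernoulli +/-2 process, variance 4

def autocov(v, max_lag):
    """Biased sample autocovariance c_v(k) for k = 0..max_lag."""
    v = v - v.mean()
    return np.array([np.dot(v[: len(v) - k], v[k:]) / len(v) for k in range(max_lag + 1)])

print(autocov(gaussian, 4))    # approximately [4, 0, 0, 0, 0]
print(autocov(bernoulli, 4))   # approximately [4, 0, 0, 0, 0]
```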
Power spectrum
- The power spectrum of a WSS process is the DTFT of the autocorrelation:
  $$P_x(e^{j\omega})=\sum_{k=-\infty}^{\infty}r_x(k)e^{-jk\omega}.$$
  The autocorrelation sequence may be determined by taking the inverse discrete-time Fourier transform of $P_x(e^{j\omega})$,
  $$r_x(k)=\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})e^{jk\omega}\,d\omega.$$
  In some cases it may be more convenient to use the z-transform instead of the discrete-time Fourier transform:
  $$P_x(z)=\sum_{k=-\infty}^{\infty}r_x(k)z^{-k}.$$
- Since the autocorrelation is conjugate symmetric, the power spectrum is real:
  $$P_x(z)=P_x^*(1/z^*) \quad \Rightarrow \quad P_x(e^{j\omega})=P_x^*(e^{j\omega})$$
- If the stochastic process is real, the power spectrum is even:
  $$P_x(z)=P_x^*(z^*) \quad \Rightarrow \quad P_x(e^{j\omega})=P_x^*(e^{-j\omega})=P_x(e^{-j\omega})$$
- The power spectrum is nonnegative:
  $$P_x(e^{j\omega})\ge 0$$
- The total power is proportional to the area under the power spectrum:
  $$E\{|x(n)|^2\}=r_x(0)=\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})\, d\omega$$
- The eigenvalues $\lambda_i$ of the $n\times n$ autocorrelation matrix are upper and lower bounded by the maximum and minimum values, respectively, of the power spectrum:
  $$\min_\omega P_x(e^{j\omega})\le \lambda_i\le\max_\omega P_x(e^{j\omega})$$
  Proof: Let $\lambda_i$ and $\mathbf q_i$ be the eigenvalues and eigenvectors, respectively, of the $n \times n$ autocorrelation matrix $\mathbf R_x$,
  $$\mathbf R_x \mathbf q_i=\lambda_i \mathbf q_i;\quad i=1,2,\cdots,n$$
  Since $\mathbf q_i^H\mathbf R_x \mathbf q_i=\lambda_i \mathbf q_i^H\mathbf q_i$,
  $$\lambda_i=\frac{\mathbf q_i^H\mathbf R_x \mathbf q_i}{ \mathbf q_i^H\mathbf q_i}.$$
  Expanding the Hermitian form in the numerator, we have
  $$\begin{aligned} \mathbf q_i^H\mathbf R_x \mathbf q_i&=\sum_{k=0}^{n-1}\sum_{l=0}^{n-1}q_i^*(k)r_x(k-l)q_i(l)\\ &=\frac{1}{2\pi}\sum_{k=0}^{n-1}\sum_{l=0}^{n-1}q_i^*(k)q_i(l)\int_{-\pi}^\pi P_x(e^{j\omega})e^{j\omega(k-l)}\,d\omega\\ &=\frac{1}{2\pi}\int_{-\pi}^\pi\left[\sum_{k=0}^{n-1}q_i^*(k)e^{j\omega k}\right]\left[\sum_{l=0}^{n-1}q_i(l) e^{-j\omega l}\right]P_x(e^{j\omega})\,d\omega \end{aligned}$$
  With $Q_i(e^{j\omega})=\sum_{k=0}^{n-1}q_i(k)e^{-jk\omega}$,
  $$\mathbf q_i^H\mathbf R_x \mathbf q_i=\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})|Q_i(e^{j\omega})|^2\,d\omega$$
  Repeating these steps we find
  $$\mathbf q_i^H \mathbf q_i=\frac{1}{2\pi}\int_{-\pi}^\pi |Q_i(e^{j\omega})|^2\,d\omega$$
  Therefore,
  $$\min_\omega P_x(e^{j\omega})\le \lambda_i=\frac{\mathbf q_i^H\mathbf R_x \mathbf q_i}{ \mathbf q_i^H\mathbf q_i}=\frac{\frac{1}{2\pi}\int_{-\pi}^\pi P_x(e^{j\omega})|Q_i(e^{j\omega})|^2\,d\omega}{\frac{1}{2\pi}\int_{-\pi}^\pi |Q_i(e^{j\omega})|^2\,d\omega}\le\max_\omega P_x(e^{j\omega})$$
- The power spectrum is related to the mean of $|X(e^{j\omega})|^2$ as
  $$P_x(e^{j\omega})=\lim_{N\to\infty}\frac{1}{2N+1}E\left\{\left|\sum_{n=-N}^N x(n)e^{-j\omega n}\right|^2\right\}$$
  i.e., the power spectrum is the limit of the ensemble average of the normalized squared magnitude of the DTFT of the data (a numerical illustration follows at the end of this section).
  Proof: Denoting the normalized squared Fourier magnitude of a finite record by $P_N(e^{j\omega})$,
  $$P_N(e^{j\omega})\triangleq \frac{1}{2N+1}\left|\sum_{n=-N}^N x(n)e^{-j\omega n}\right|^2,$$
  the equation above becomes
  $$P_x(e^{j\omega})=\lim_{N\to \infty}E\{P_N(e^{j\omega})\}.$$
  Since
  $$\begin{aligned} E\{P_N(e^{j\omega})\}&=\frac{1}{2N+1}E\left\{\left(\sum_{n=-N}^N x(n)e^{-j\omega n}\right)\left(\sum_{m=-N}^N x(m)e^{-j\omega m}\right)^*\right\}\\ &=\frac{1}{2N+1}E\left\{\sum_{n=-N}^N \sum_{m=-N}^N x(n)x^*(m)e^{-j\omega (n-m)}\right\}\\ &=\frac{1}{2N+1}\sum_{n=-N}^N \sum_{m=-N}^N r_x(n-m)e^{-j\omega (n-m)}\\ &=\frac{1}{2N+1}\sum_{k=-2N}^{2N}(2N+1-|k|)r_x(k)e^{-jk\omega}, \end{aligned}$$
  and assuming that the autocorrelation sequence decays to zero fast enough so that $\sum_{k=-\infty}^\infty|k|\,|r_x(k)|<\infty$, letting $N\to \infty$ it follows that
  $$\lim_{N\to \infty}E\{P_N(e^{j\omega})\}=\sum_{k=-\infty}^\infty r_x(k)e^{-jk\omega}=P_x(e^{j\omega})$$
- If $x(n)$ has a nonzero mean or a periodicity, the power spectrum contains impulses.
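Here is the numerical sketch referenced above (NumPy; the MA(1) model and its parameters are assumptions made for illustration): averaging the normalized squared DFT magnitude over many realizations of $x(n)=v(n)+b\,v(n-1)$ approaches $P_x(e^{j\omega})=1+b^2+2b\cos\omega$, the DTFT of its autocorrelation sequence.

```python
import numpy as np

rng = np.random.default_rng(4)
b, N, trials = 0.8, 1024, 500
omega = 2 * np.pi * np.arange(N) / N           # DFT frequency grid

P_avg = np.zeros(N)
for _ in range(trials):
    v = rng.normal(size=N + 1)                  # unit-variance white noise
    x = v[1:] + b * v[:-1]                      # MA(1): x(n) = v(n) + b v(n-1)
    P_avg += np.abs(np.fft.fft(x)) ** 2 / N     # periodogram P_N(e^{jw})
P_avg /= trials                                 # ensemble-averaged periodogram, ~ E{P_N}

P_true = 1 + b**2 + 2 * b * np.cos(omega)       # P_x(e^{jw}) from r_x(0)=1+b^2, r_x(+-1)=b
print(np.mean(np.abs(P_avg - P_true)))          # small: the average approaches the power spectrum
```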
Filtering Random Processes
- The relationship between the mean and autocorrelation of the input process and those of the output process:
  Suppose $x(n)$ is a WSS process with mean $m_x$ and correlation $r_x(k)$ that is filtered by a stable LSI filter with unit sample response $h(n)$; then the output $y(n)$ is also WSS with
  $$\begin{aligned} m_y&=m_xH(e^{j0})\\ r_y(k)&=r_x(k)*h(k)*h^*(-k)\\ &=r_{yx}(k)*h^*(-k)\\ &=r_x(k)*r_h(k) \end{aligned}$$
  where $r_{yx}(k)\triangleq r_x(k)*h(k)$ and $r_h(k)\triangleq h(k)*h^*(-k)$, as derived below.
Proof: The output $y(n)$ is related to $x(n)$ by the convolution sum
$$y(n)=x(n)*h(n)=\sum_{k=-\infty}^\infty h(k)x(n-k)$$
The mean of $y(n)$ may be found by taking the expected value of both sides,
$$\begin{aligned} E\{y(n)\}&=E\left\{\sum_{k=-\infty}^\infty h(k)x(n-k)\right\}=\sum_{k=-\infty}^\infty h(k)E\{x(n-k)\}\\ &=m_x\sum_{k=-\infty}^\infty h(k)=m_xH(e^{j0}) \end{aligned}$$
Before computing the autocorrelation of $y(n)$, we first compute the cross-correlation between $x(n)$ and $y(n)$ (the autocorrelation can also be computed directly),
$$\begin{aligned} r_{yx}(n+k,n)&=E\{y(n+k)x^*(n)\}=E\left\{\sum_{l=-\infty}^\infty h(l)x(n+k-l)x^*(n)\right\}\\ &=\sum_{l=-\infty}^\infty h(l)E\{x(n+k-l)x^*(n)\}\\ &=\sum_{l=-\infty}^\infty h(l) r_x(k-l)=r_x(k)*h(k) \end{aligned}$$
The autocorrelation of $y(n)$ may now be determined as follows,
$$\begin{aligned} r_{y}(n+k,n)&=E\{y(n+k)y^*(n)\}=E\left\{y(n+k)\sum_{l=-\infty}^\infty x^*(l)h^*(n-l)\right\}\\ &=\sum_{l=-\infty}^\infty h^*(n-l)E\{y(n+k)x^*(l)\}\\ &=\sum_{l=-\infty}^\infty h^*(n-l) r_{yx}(n+k-l)=r_{yx}(k)*h^*(-k) \end{aligned}$$
Therefore, the autocorrelation sequence $r_y(n + k, n)$ depends only on $k$, the difference between the indices $n+k$ and $n$, i.e.,
$$r_y(k)=r_{yx}(k)*h^*(-k)=r_x(k)*h(k)*h^*(-k)=\sum_{l=-\infty}^\infty \sum_{m=-\infty}^\infty h(l)r_x(m-l+k)h^*(m)$$
Another interpretation can be obtained by defining
$$r_h(k)=h(k)*h^*(-k)=\sum_{n=-\infty}^{\infty}h(n+k)h^*(n)$$
Then
$$r_y(k)=r_x(k)*r_h(k)$$
- The power of $y(n)$ is given by
  $$E\{|y(n)|^2\}=r_y(0)=\sum_{l=-\infty}^\infty \sum_{m=-\infty}^\infty h(l)r_x(m-l)h^*(m)=\mathbf h^H \mathbf R_x \mathbf h$$
  where we assume $h(n)$ is zero outside $[0,N-1]$ and $\mathbf h=[h(0),h(1),\cdots,h(N-1)]^T$.
- In terms of the power spectrum,
  $$\begin{aligned} P_y(e^{j\omega})&=P_x(e^{j\omega})|H(e^{j\omega})|^2\\ P_y(z)&=P_x(z)H(z)H^*(1/z^*) \end{aligned}$$
  So, assuming no pole/zero cancellations between $P_x(z)$ and $H(z)$, if $H(z)$ has a pole (zero) at $z=z_0$, then $P_y(z)$ also has a pole (zero) at $z=z_0$ and another at the conjugate reciprocal location $z=1/z_0^*$.
- If $H(e^{j\omega})$ is a narrow-band bandpass filter with center frequency $\omega_0$, bandwidth $\Delta \omega$, and magnitude $1$, then the output power is
  $$E\{|y(n)|^2\}=r_y(0)=\frac{1}{2\pi}\int_{-\pi}^\pi |H(e^{j\omega})|^2P_x(e^{j\omega})\,d\omega \approx \frac{\Delta \omega}{2\pi}P_x(e^{j\omega_0})$$
Since $E\{|y(n)|^2\}$ represents the power of $x(n)$ within the band $\Delta \omega$, the power spectrum $P_x(e^{j\omega})$ may be viewed as a density function that describes how the power in $x(n)$ varies with $\omega$, i.e., how the power is distributed over frequency.
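A short numerical check of $P_y(e^{j\omega})=|H(e^{j\omega})|^2P_x(e^{j\omega})$ (a NumPy/SciPy sketch; the FIR coefficients and record lengths are illustrative assumptions): unit-variance white noise, for which $P_x(e^{j\omega})=1$, is filtered and the averaged periodogram of the output is compared against $|H(e^{j\omega})|^2$.

```python
import numpy as np
from scipy.signal import lfilter, freqz

rng = np.random.default_rng(5)

# Unit-variance white noise filtered by a short FIR filter h(n).
h = np.array([1.0, -0.5, 0.25])
N, trials = 4096, 200

w, H = freqz(h, worN=N, whole=True)        # H(e^{jw}) on the FFT frequency grid
P_theory = np.abs(H) ** 2                  # P_y(e^{jw}) = sigma_v^2 |H(e^{jw})|^2 with sigma_v^2 = 1

P_est = np.zeros(N)
for _ in range(trials):
    v = rng.normal(size=N)
    y = lfilter(h, [1.0], v)               # y(n) = x(n) * h(n)
    P_est += np.abs(np.fft.fft(y)) ** 2 / N
P_est /= trials                            # averaged periodogram estimate of P_y

print(np.mean(np.abs(P_est - P_theory)))   # small: the estimate tracks |H|^2 P_x
```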