2 Probabilistic Robotics: the Bayes filter algorithm
0 Mathematical preliminaries
- The probability that the random variable $X$ takes the value $x_t$, $p(X=x_t)$, is abbreviated as $p(x_t)$.
- Measurements $z$:

  $$z_{t_1:t_2} = z_{t_1}, z_{t_1+1}, z_{t_1+2}, \cdots, z_{t_2}$$

- Controls $u$:

  $$u_{t_1:t_2} = u_{t_1}, u_{t_1+1}, u_{t_1+2}, \cdots, u_{t_2}$$

- Conditional probability (Bayes' rule):

  $$p(x_t|z_t) = \frac{p(z_t|x_t)p(x_t)}{p(z_t)}$$

  Adding $z_{1:t-1},u_{1:t}$ to the condition of every term leaves the equality valid:

  $$p(x_t|z_t,z_{1:t-1},u_{1:t}) = \frac{p(z_t|x_t,z_{1:t-1},u_{1:t})p(x_t|z_{1:t-1},u_{1:t})}{p(z_t|z_{1:t-1},u_{1:t})}$$

- Once the state $x_t$ is known, the measurement $z_t$ is independent of the earlier measurements $z_{1:t-1}$ and of the controls $u_{1:t}$ (Markov property):

  $$p(z_t|x_t,z_{1:t-1},u_{1:t}) = p(z_t|x_t)$$

- The state $x_t$ depends only on the previous state $x_{t-1}$ and the current control $u_t$:

  $$p(x_t|x_{t-1},z_{1:t-1},u_{1:t})=p(x_t|x_{t-1},u_t) \quad \text{(state transition probability)}$$

- When the quantity being estimated belongs to time $t-1$, the control $u_t$ issued at time $t$ can be dropped from the condition: a future quantity (time $t$) cannot influence a present one (time $t-1$), because time does not run backwards. Strictly speaking, this also assumes that $u_t$ is not chosen as a function of the state.

  $$p(x_{t-1}|z_{1:t-1},u_{1:t}) = p(x_{t-1}|z_{1:t-1},u_{1:t-1})$$
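The Markov property can be checked numerically on a toy model. The sketch below is only an illustration: the two states, the two sensor readings, and all probabilities are made-up values, and the control is left out for brevity. It builds the joint distribution $p(x_0, x_1, z_1)$ from the factorization prior × transition × measurement, then confirms that conditioning on $x_0$ in addition to $x_1$ does not change the distribution of $z_1$.

```python
from itertools import product

# Hypothetical two-state model, chosen only for illustration.
STATES = ("a", "b")
READINGS = ("ping", "silent")

prior = {"a": 0.3, "b": 0.7}                       # p(x0)
transition = {"a": {"a": 0.9, "b": 0.1},           # p(x1 | x0), control omitted
              "b": {"a": 0.4, "b": 0.6}}
measurement = {"a": {"ping": 0.8, "silent": 0.2},  # p(z1 | x1)
               "b": {"ping": 0.1, "silent": 0.9}}

# Joint distribution p(x0, x1, z1) built from the model's factorization.
joint = {
    (x0, x1, z1): prior[x0] * transition[x0][x1] * measurement[x1][z1]
    for x0, x1, z1 in product(STATES, STATES, READINGS)
}


def conditional_z_given_x1_x0(z1, x1, x0):
    """p(z1 | x1, x0), from the definition of conditional probability."""
    denom = sum(joint[(x0, x1, z)] for z in READINGS)
    return joint[(x0, x1, z1)] / denom


def conditional_z_given_x1(z1, x1):
    """p(z1 | x1), with x0 marginalized out."""
    num = sum(joint[(x0, x1, z1)] for x0 in STATES)
    denom = sum(joint[(x0, x1, z)] for x0 in STATES for z in READINGS)
    return num / denom


# The extra condition x0 carries no information about z1 once x1 is known.
for x0, x1, z1 in product(STATES, STATES, READINGS):
    full = conditional_z_given_x1_x0(z1, x1, x0)
    reduced = conditional_z_given_x1(z1, x1)
    assert abs(full - reduced) < 1e-12
```

The check passes by construction, which is the point: the independence is a consequence of how the model factorizes, not an extra fact about the data.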
1 Deriving the Bayes filter equations
- The Bayes filter computes $bel(x_t)$ from $bel(x_{t-1})$, $u_t$, and $z_t$, where

  $$bel(x_t) = p(x_t|z_{1:t},u_{1:t})\\ \bar{bel}(x_t) = p(x_t|z_{1:t-1},u_{1:t})$$

- Derivation 1 (measurement update):

  $$\begin{aligned} bel(x_t) &=p(x_t|z_{1:t},u_{1:t})\\ &= p(x_t|z_t,z_{1:t-1},u_{1:t})\\ &= \frac{p(z_t|x_t,z_{1:t-1},u_{1:t})p(x_t|z_{1:t-1},u_{1:t})}{p(z_t|z_{1:t-1},u_{1:t})}\\ &\downarrow \text{define } \eta = p(z_t|z_{1:t-1},u_{1:t})^{-1}\\ &= \eta p(z_t|x_t,z_{1:t-1},u_{1:t})p(x_t|z_{1:t-1},u_{1:t})\\ &\downarrow p(z_t|x_t,z_{1:t-1},u_{1:t}) = p(z_t|x_t)\\ &= \eta p(z_t|x_t)p(x_t|z_{1:t-1},u_{1:t})\\ &\downarrow\\ bel(x_t) &= \eta p(z_t|x_t)\bar{bel}(x_t)~~~(\eta:\text{normalizer}) \end{aligned}$$

- Derivation 2 (prediction), starting from the law of total probability:

  $$p(x_t) = \int p(x_t|x_{t-1})p(x_{t-1})dx_{t-1}$$

  $$\begin{aligned} \bar{bel}(x_t) &= p(x_t|z_{1:t-1},u_{1:t})\\ &= \int p(x_t|x_{t-1},z_{1:t-1},u_{1:t})p(x_{t-1}|z_{1:t-1},u_{1:t})dx_{t-1}\\ &\downarrow p(x_t|x_{t-1},z_{1:t-1},u_{1:t})=p(x_t|x_{t-1},u_t)\\ &\downarrow p(x_{t-1}|z_{1:t-1},u_{1:t}) = p(x_{t-1}|z_{1:t-1},u_{1:t-1})\\ &\downarrow\\ \bar{bel}(x_t) &= \int p(x_t|x_{t-1},u_t)p(x_{t-1}|z_{1:t-1},u_{1:t-1})dx_{t-1}\\ &= \int p(x_t|x_{t-1},u_t)bel(x_{t-1})dx_{t-1} \end{aligned}$$

- Summary:

  $$\bar{bel}(x_t) = \int p(x_t|x_{t-1},u_t)bel(x_{t-1})dx_{t-1}\\ bel(x_t) = \eta p(z_t|x_t)\bar{bel}(x_t)~~~(\eta:\text{normalizer})$$
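The normalizer $\eta$ never has to be evaluated from its definition. As a short worked step that uses only the definitions and the Markov property stated earlier, expanding $p(z_t|z_{1:t-1},u_{1:t})$ over $x_t$ with the law of total probability gives:

$$\begin{aligned} \eta^{-1} &= p(z_t|z_{1:t-1},u_{1:t})\\ &= \int p(z_t|x_t,z_{1:t-1},u_{1:t})p(x_t|z_{1:t-1},u_{1:t})dx_t\\ &= \int p(z_t|x_t)\bar{bel}(x_t)dx_t \end{aligned}$$

So $\eta$ is the reciprocal of the sum (or integral) of the unnormalized values $p(z_t|x_t)\bar{bel}(x_t)$ over all states, which is the same as requiring that $bel(x_t)$ integrates to 1.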
2 The Bayes filter algorithm
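- Three probability distributions are required:
  - Initial belief: $bel(x_0)=p(x_0)$
  - Measurement probability: $p(z_t|x_t)$
  - State transition probability: $p(x_t|u_t,x_{t-1})$
- Procedure, as listed in Probabilistic Robotics (the example and the analysis refer to these line numbers):
  1. Algorithm Bayes_filter($bel(x_{t-1})$, $u_t$, $z_t$):
  2. for all $x_t$ do
  3. $\bar{bel}(x_t) = \int p(x_t|u_t,x_{t-1})bel(x_{t-1})dx_{t-1}$
  4. $bel(x_t) = \eta p(z_t|x_t)\bar{bel}(x_t)$
  5. end for
  6. return $bel(x_t)$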
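For a finite state space the procedure can be written in a few lines. What follows is a minimal sketch, not a reference implementation: the function name `bayes_filter`, its argument layout, and the lamp model in the usage part are all assumptions made for illustration. The integral in the prediction step becomes a sum over the previous state.

```python
def bayes_filter(bel_prev, u, z, states, transition, measurement):
    """One step of a discrete Bayes filter.

    bel_prev:    dict mapping each state to bel(x_{t-1})
    transition:  function (x_t, u_t, x_prev) -> p(x_t | u_t, x_prev)
    measurement: function (z_t, x_t) -> p(z_t | x_t)
    """
    # Prediction: the integral becomes a sum over the previous state.
    bel_bar = {
        x: sum(transition(x, u, x_prev) * bel_prev[x_prev] for x_prev in states)
        for x in states
    }
    # Measurement update, first without the normalizer.
    unnormalized = {x: measurement(z, x) * bel_bar[x] for x in states}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("measurement has zero probability under the prediction")
    eta = 1.0 / total
    return {x: eta * p for x, p in unnormalized.items()}


# Hypothetical usage: a lamp that a "toggle" control flips with probability 0.9.
states = ("on", "off")


def transition(x, u, x_prev):
    if u == "toggle":
        return 0.9 if x != x_prev else 0.1
    return 1.0 if x == x_prev else 0.0


def measurement(z, x):
    # The light sensor reports the true state with probability 0.7.
    return 0.7 if z == x else 0.3


bel = {"on": 0.2, "off": 0.8}
bel = bayes_filter(bel, "toggle", "on", states, transition, measurement)
print(bel)
```

Passing the two models as functions keeps the filter itself independent of how the probabilities are stored; lookup tables would work just as well.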
3 Example
- To make the equations and the algorithm concrete, consider a mobile robot that uses a sensor to estimate the state of a door.
- Assumptions:
  - The door state $X_t$ is either open or closed: $X_t=\text{is\_open}$ or $X_t=\text{is\_closed}$
  - The sensor reading $Z_t$ reports one of these two states: $Z_t=\text{sense\_open}$ or $Z_t=\text{sense\_closed}$
  - The sensor is noisy, so its readings can be wrong. The errors are described by conditional probabilities:

    $$p(Z_t=\text{sense\_open}|X_t=\text{is\_open}) = 0.6\\ p(Z_t=\text{sense\_closed}|X_t=\text{is\_open}) = 0.4\\ p(Z_t=\text{sense\_open}|X_t=\text{is\_closed}) = 0.2\\ p(Z_t=\text{sense\_closed}|X_t=\text{is\_closed}) = 0.8$$
- The robot can change its environment through its actions (pushing the door open with its manipulator):
  - The control $U_t$ is either push or do nothing: $U_t = \text{push}$ or $U_t = \text{do\_nothing}$
  - If the door is closed, a push opens it with probability 0.8
  - Conditional probabilities after the robot pushes:

    $$p(X_t=\text{is\_open}|U_t = \text{push}, X_{t-1}=\text{is\_open}) = 1\\ p(X_t=\text{is\_closed}|U_t = \text{push}, X_{t-1}=\text{is\_open}) = 0\\ p(X_t=\text{is\_open}|U_t = \text{push}, X_{t-1}=\text{is\_closed}) = 0.8\\ p(X_t=\text{is\_closed}|U_t = \text{push}, X_{t-1}=\text{is\_closed}) = 0.2$$

  - Conditional probabilities when the robot does nothing:

    $$p(X_t=\text{is\_open}|U_t = \text{do\_nothing}, X_{t-1}=\text{is\_open}) = 1\\ p(X_t=\text{is\_closed}|U_t = \text{do\_nothing}, X_{t-1}=\text{is\_open}) = 0\\ p(X_t=\text{is\_open}|U_t = \text{do\_nothing}, X_{t-1}=\text{is\_closed}) = 0\\ p(X_t=\text{is\_closed}|U_t = \text{do\_nothing}, X_{t-1}=\text{is\_closed}) = 1$$
- Compute the belief $bel(x_1)$ at time $t=1$:
  - Initially the robot knows nothing about the door state, so each state gets a belief of $50\%$:

    $$bel(X_0 = \text{is\_open}) = 0.5\\ bel(X_0 = \text{is\_closed}) = 0.5$$
  - Line 3 of the algorithm (prediction), assuming the robot does nothing at $t=1$, that is $U_1 = \text{do\_nothing}$:

    $$\bar{bel}(x_t) = \int p(x_t|x_{t-1},u_t)bel(x_{t-1})dx_{t-1}$$

    The state is discrete, so the integral becomes a sum over the two values of $X_0$:

    $$\begin{aligned} \bar{bel}(x_1) &= \int p(x_1|x_{0},u_1)bel(x_{0})dx_{0}\\ &= \sum_{x_0} p(x_1|x_{0},u_1)bel(x_{0})\\ &\downarrow ~~(X_0 \text{ is either } \text{is\_open} \text{ or } \text{is\_closed})\\ &= p(x_1|U_1 = \text{do\_nothing}, X_{0}=\text{is\_open}) bel(X_0 = \text{is\_open})\\ &~~~+p(x_1|U_1 = \text{do\_nothing}, X_{0}=\text{is\_closed}) bel(X_0 = \text{is\_closed}) \end{aligned}$$

    $$\begin{aligned} \bar{bel}(X_1=\text{is\_open}) &= p(X_1=\text{is\_open}|U_1 = \text{do\_nothing}, X_{0}=\text{is\_open}) bel(X_0 = \text{is\_open})\\ &~~~+p(X_1=\text{is\_open}|U_1 = \text{do\_nothing}, X_{0}=\text{is\_closed}) bel(X_0 = \text{is\_closed})\\ &= 1 \cdot 0.5 + 0 \cdot 0.5\\ &= 0.5 \end{aligned}$$

    $$\begin{aligned} \bar{bel}(X_1=\text{is\_closed}) &= p(X_1=\text{is\_closed}|U_1 = \text{do\_nothing}, X_{0}=\text{is\_open}) bel(X_0 = \text{is\_open})\\ &~~~+p(X_1=\text{is\_closed}|U_1 = \text{do\_nothing}, X_{0}=\text{is\_closed}) bel(X_0 = \text{is\_closed})\\ &= 0 \cdot 0.5 + 1 \cdot 0.5\\ &= 0.5 \end{aligned}$$
  - Line 4 of the algorithm (measurement update), assuming the sensor reports an open door at $t=1$, that is $Z_1 = \text{sense\_open}$:

    $$bel(x_t) = \eta p(z_t|x_t)\bar{bel}(x_t)~~~(\eta:\text{normalizer})\\ \downarrow\\ bel(x_1) = \eta p(Z_1 = \text{sense\_open}|x_1)\bar{bel}(x_1)$$

    $$\begin{aligned} bel(X_1=\text{is\_open}) &= \eta p(Z_1 = \text{sense\_open}|X_1=\text{is\_open})\bar{bel}(X_1=\text{is\_open})\\ &=\eta \cdot 0.6 \cdot 0.5\\ &=0.3\eta \end{aligned}$$

    $$\begin{aligned} bel(X_1=\text{is\_closed}) &= \eta p(Z_1 = \text{sense\_open}|X_1=\text{is\_closed})\bar{bel}(X_1=\text{is\_closed})\\ &=\eta \cdot 0.2 \cdot 0.5\\ &=0.1\eta \end{aligned}$$

    Normalizing with $\eta = (0.3+0.1)^{-1} = 2.5$ gives:

    $$bel(X_1=\text{is\_open}) = 0.3\eta = 0.75\\ bel(X_1=\text{is\_closed}) = 0.1\eta = 0.25$$
- Similarly, compute the belief $bel(x_2)$ at time $t=2$:
  - Line 3 of the algorithm (prediction), assuming the robot pushes at $t=2$, that is $U_2 = \text{push}$:

    $$\begin{aligned} \bar{bel}(X_2=\text{is\_open}) &= p(X_2=\text{is\_open}|U_2 = \text{push}, X_{1}=\text{is\_open}) bel(X_1 = \text{is\_open})\\ &~~~+p(X_2=\text{is\_open}|U_2 = \text{push}, X_{1}=\text{is\_closed}) bel(X_1 = \text{is\_closed})\\ &= 1 \cdot 0.75 + 0.8 \cdot 0.25\\ &= 0.95 \end{aligned}$$

    $$\begin{aligned} \bar{bel}(X_2=\text{is\_closed}) &= p(X_2=\text{is\_closed}|U_2 = \text{push}, X_{1}=\text{is\_open}) bel(X_1 = \text{is\_open})\\ &~~~+p(X_2=\text{is\_closed}|U_2 = \text{push}, X_{1}=\text{is\_closed}) bel(X_1 = \text{is\_closed})\\ &= 0 \cdot 0.75 + 0.2 \cdot 0.25\\ &= 0.05 \end{aligned}$$
  - Line 4 of the algorithm (measurement update), assuming the sensor reports an open door at $t=2$, that is $Z_2 = \text{sense\_open}$:

    $$\begin{aligned} bel(X_2=\text{is\_open}) &= \eta p(Z_2 = \text{sense\_open}|X_2=\text{is\_open})\bar{bel}(X_2=\text{is\_open})\\ &=\eta \cdot 0.6 \cdot 0.95\\ &=0.57\eta \end{aligned}$$

    $$\begin{aligned} bel(X_2=\text{is\_closed}) &= \eta p(Z_2 = \text{sense\_open}|X_2=\text{is\_closed})\bar{bel}(X_2=\text{is\_closed})\\ &=\eta \cdot 0.2 \cdot 0.05\\ &=0.01\eta \end{aligned}$$

    Normalizing with $\eta = (0.57+0.01)^{-1} \approx 1.724$ gives:

    $$bel(X_2=\text{is\_open}) \approx 0.983\\ bel(X_2=\text{is\_closed}) \approx 0.017$$
- The beliefs $bel(x_t)$ at $t=3, 4, \cdots$ are computed in the same way.
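The hand calculation for the door can be replayed in code. The sketch below stores the two models from this section as small tables; the names `TRANSITION`, `LIKELIHOOD`, `predict`, and `update` are chosen here for illustration and are not from the book. Run on the same controls and measurements, it should reproduce the values obtained for $t=1$ and $t=2$ up to rounding.

```python
# State order in every list: [is_open, is_closed].
# TRANSITION[u][i][j] = p(X_t = j | U_t = u, X_{t-1} = i)
TRANSITION = {
    "push":       [[1.0, 0.0], [0.8, 0.2]],
    "do_nothing": [[1.0, 0.0], [0.0, 1.0]],
}
# LIKELIHOOD[z][j] = p(Z_t = z | X_t = j)
LIKELIHOOD = {
    "sense_open":   [0.6, 0.2],
    "sense_closed": [0.4, 0.8],
}


def predict(bel, u):
    """Prediction: bel_bar[j] = sum_i p(j | u, i) * bel[i]."""
    table = TRANSITION[u]
    return [sum(table[i][j] * bel[i] for i in range(2)) for j in range(2)]


def update(bel_bar, z):
    """Measurement update: weight by the likelihood, then normalize."""
    unnormalized = [LIKELIHOOD[z][j] * bel_bar[j] for j in range(2)]
    eta = 1.0 / sum(unnormalized)
    return [eta * p for p in unnormalized]


bel = [0.5, 0.5]  # bel(x_0): no prior knowledge about the door
history = []
for u, z in [("do_nothing", "sense_open"), ("push", "sense_open")]:
    bel_bar = predict(bel, u)
    bel = update(bel_bar, z)
    history.append((bel_bar, bel))

for t, (bel_bar_t, bel_t) in enumerate(history, start=1):
    print(f"t={t}:",
          "bel_bar =", [round(p, 3) for p in bel_bar_t],
          "bel =", [round(p, 3) for p in bel_t])
```

Extending the list of `(u, z)` pairs continues the computation to $t=3, 4, \cdots$ without any further changes.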
4 Analysis
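- Line 3 of the algorithm is called the control update, or prediction, because it computes the belief from the control the robot executes, before the measurement is taken into account.
- Line 4 of the algorithm is called the measurement update, because it corrects the predicted belief with the information from the sensor measurement.
- $\eta$ serves only as a normalizer and is computed last, once the unnormalized values for all states are available.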
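Why the normalizer appears only in the measurement update can be seen from the $t=2$ step of the door example. In the sketch below (variable names are illustrative), the prediction leaves the total probability mass at 1, because every row of the transition table sums to 1. Multiplying by the likelihood shrinks the mass, and $\eta$ restores it.

```python
# Values from the t=2 step of the door example; order: [is_open, is_closed].
bel_prev = [0.75, 0.25]                     # bel(x_1)
transition_push = [[1.0, 0.0], [0.8, 0.2]]  # rows indexed by the previous state
likelihood_sense_open = [0.6, 0.2]          # p(sense_open | x_2)

# Prediction: each row of the transition table sums to 1, so mass is preserved.
bel_bar = [
    sum(transition_push[i][j] * bel_prev[i] for i in range(2))
    for j in range(2)
]
mass_after_prediction = sum(bel_bar)

# Update without eta: the likelihood values do not sum to 1 over the states,
# so the total mass drops below 1.
unnormalized = [likelihood_sense_open[j] * bel_bar[j] for j in range(2)]
mass_after_update = sum(unnormalized)

# eta is the reciprocal of the remaining mass.
eta = 1.0 / mass_after_update
bel = [eta * p for p in unnormalized]

print("mass after prediction:", round(mass_after_prediction, 3))
print("mass after update:", round(mass_after_update, 3))
print("eta:", round(eta, 3))
print("bel:", [round(p, 3) for p in bel])
```

Because $\eta$ is a single scalar shared by all states, it cannot be known until every unnormalized value has been computed, which is why it comes last.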