2 Probabilistic Robotics (《概率机器人》): The Bayes Filter Algorithm

0 Mathematical Preliminaries

  • The probability that the random variable $X$ takes the value $x_t$, $p(X = x_t)$, is abbreviated as $p(x_t)$
  • Measurements $z$:
    $$z_{t_1:t_2} = z_{t_1}, z_{t_1+1}, z_{t_1+2}, \cdots, z_{t_2}$$
  • Controls $u$:
    $$u_{t_1:t_2} = u_{t_1}, u_{t_1+1}, u_{t_1+2}, \cdots, u_{t_2}$$
  • Conditional probability (Bayes rule):
    $$p(x_t \mid z_t) = \frac{p(z_t \mid x_t)\, p(x_t)}{p(z_t)}$$
    Adding $z_{1:t-1}, u_{1:t}$ to the condition of every term leaves the equality valid:
    $$p(x_t \mid z_t, z_{1:t-1}, u_{1:t}) = \frac{p(z_t \mid x_t, z_{1:t-1}, u_{1:t})\, p(x_t \mid z_{1:t-1}, u_{1:t})}{p(z_t \mid z_{1:t-1}, u_{1:t})}$$
  • Given the state $x_t$, the measurement $z_t$ is conditionally independent of the earlier measurements $z_{1:t-1}$ and the controls $u_{1:t}$ (Markov property):
    $$p(z_t \mid x_t, z_{1:t-1}, u_{1:t}) = p(z_t \mid x_t)$$
  • The state $x_t$ depends only on the previous state $x_{t-1}$ and the current control $u_t$:
    $$p(x_t \mid x_{t-1}, z_{1:t-1}, u_{1:t}) = p(x_t \mid x_{t-1}, u_t) \quad \text{(state transition probability)}$$
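  • When the quantity of interest is the state at time $t-1$, the control $u_t$ applied at time $t$ can be dropped from the conditioning variables: a control issued in the future cannot influence the present state. This assumes that $u_t$ is not itself chosen based on $x_{t-1}$. A future measurement $z_t$, by contrast, does carry information about $x_{t-1}$ and could not be dropped this way.
    $$p(x_{t-1} \mid z_{1:t-1}, u_{1:t}) = p(x_{t-1} \mid z_{1:t-1}, u_{1:t-1})$$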
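
The claim that Bayes rule still holds after adding the same extra variables to every condition can be checked numerically. The sketch below is only an illustration: the joint distribution is random and hypothetical, and a single binary variable `y` stands in for the whole history $z_{1:t-1}, u_{1:t}$.

```python
import itertools
import random

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical joint distribution p(x, z, y) over three binary variables.
# y stands in for the extra conditioning variables (z_{1:t-1}, u_{1:t}).
outcomes = list(itertools.product([0, 1], repeat=3))
weights = [random.random() for _ in outcomes]
total = sum(weights)
joint = {o: w / total for o, w in zip(outcomes, weights)}


def prob(x=None, z=None, y=None):
    """Probability of the given assignments; None means 'summed out'."""
    return sum(
        p
        for (xi, zi, yi), p in joint.items()
        if (x is None or xi == x)
        and (z is None or zi == z)
        and (y is None or yi == y)
    )


x, z, y = 1, 0, 1
# Left-hand side: p(x | z, y) straight from the definition.
lhs = prob(x=x, z=z, y=y) / prob(z=z, y=y)
# Right-hand side: p(z | x, y) p(x | y) / p(z | y).
likelihood = prob(x=x, z=z, y=y) / prob(x=x, y=y)
prior = prob(x=x, y=y) / prob(y=y)
evidence = prob(z=z, y=y) / prob(y=y)
rhs = likelihood * prior / evidence

print(abs(lhs - rhs) < 1e-12)
```

Both sides reduce to the same ratio of joint probabilities, so they agree up to floating-point rounding for any choice of `x`, `z`, `y`.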

1 Derivation of the Bayes Filter

  • The Bayes filter computes $bel(x_t)$ from $bel(x_{t-1})$, $u_t$, and $z_t$, where
    $$bel(x_t) = p(x_t \mid z_{1:t}, u_{1:t})$$
    $$\bar{bel}(x_t) = p(x_t \mid z_{1:t-1}, u_{1:t})$$

  • Derivation 1
    $$
    \begin{aligned}
    bel(x_t) &= p(x_t \mid z_{1:t}, u_{1:t}) \\
    &= p(x_t \mid z_t, z_{1:t-1}, u_{1:t}) \\
    &= \frac{p(z_t \mid x_t, z_{1:t-1}, u_{1:t})\, p(x_t \mid z_{1:t-1}, u_{1:t})}{p(z_t \mid z_{1:t-1}, u_{1:t})} \\
    &\downarrow \text{define } \eta = p(z_t \mid z_{1:t-1}, u_{1:t})^{-1} \\
    &= \eta\, p(z_t \mid x_t, z_{1:t-1}, u_{1:t})\, p(x_t \mid z_{1:t-1}, u_{1:t}) \\
    &\downarrow p(z_t \mid x_t, z_{1:t-1}, u_{1:t}) = p(z_t \mid x_t) \\
    &= \eta\, p(z_t \mid x_t)\, p(x_t \mid z_{1:t-1}, u_{1:t}) \\
    &\downarrow \\
    bel(x_t) &= \eta\, p(z_t \mid x_t)\, \bar{bel}(x_t) \quad (\eta \text{: normalizer})
    \end{aligned}
    $$

  • Derivation 2
    Start from the law of total probability:
    $$p(x_t) = \int p(x_t \mid x_{t-1})\, p(x_{t-1})\, dx_{t-1}$$
    Applying it with $z_{1:t-1}, u_{1:t}$ added to every condition gives:
    $$
    \begin{aligned}
    \bar{bel}(x_t) &= p(x_t \mid z_{1:t-1}, u_{1:t}) \\
    &= \int p(x_t \mid x_{t-1}, z_{1:t-1}, u_{1:t})\, p(x_{t-1} \mid z_{1:t-1}, u_{1:t})\, dx_{t-1} \\
    &\downarrow p(x_t \mid x_{t-1}, z_{1:t-1}, u_{1:t}) = p(x_t \mid x_{t-1}, u_t) \\
    &\downarrow p(x_{t-1} \mid z_{1:t-1}, u_{1:t}) = p(x_{t-1} \mid z_{1:t-1}, u_{1:t-1}) \\
    \bar{bel}(x_t) &= \int p(x_t \mid x_{t-1}, u_t)\, p(x_{t-1} \mid z_{1:t-1}, u_{1:t-1})\, dx_{t-1} \\
    &= \int p(x_t \mid x_{t-1}, u_t)\, bel(x_{t-1})\, dx_{t-1}
    \end{aligned}
    $$

  • Summary:
    $$\bar{bel}(x_t) = \int p(x_t \mid x_{t-1}, u_t)\, bel(x_{t-1})\, dx_{t-1}$$
    $$bel(x_t) = \eta\, p(z_t \mid x_t)\, \bar{bel}(x_t) \quad (\eta \text{: normalizer})$$
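
For a finite state space the two summary formulas can be written with sums, which is the form the door example uses. The explicit expression for $\eta$ below follows from requiring $bel(x_t)$ to sum to 1; the primed symbol $x_t'$ is just a summation variable introduced here.

$$
\begin{aligned}
\bar{bel}(x_t) &= \sum_{x_{t-1}} p(x_t \mid x_{t-1}, u_t)\, bel(x_{t-1}) \\
bel(x_t) &= \eta\, p(z_t \mid x_t)\, \bar{bel}(x_t), \qquad
\eta = \Big[ \sum_{x_t'} p(z_t \mid x_t')\, \bar{bel}(x_t') \Big]^{-1}
\end{aligned}
$$

The bracketed sum equals $p(z_t \mid z_{1:t-1}, u_{1:t})$, so this agrees with the definition of $\eta$ in Derivation 1.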

2 The Bayes Filter Procedure

  • Three probability distributions are required:
    • Initial belief: $bel(x_0) = p(x_0)$
    • Measurement probability: $p(z_t \mid x_t)$
    • State transition probability: $p(x_t \mid u_t, x_{t-1})$
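  • Procedure:
    [Figure: the Bayes filter algorithm listing. Line 3 of the listing is the prediction step $\bar{bel}(x_t) = \int p(x_t \mid x_{t-1}, u_t)\, bel(x_{t-1})\, dx_{t-1}$, and line 4 is the measurement update $bel(x_t) = \eta\, p(z_t \mid x_t)\, \bar{bel}(x_t)$.]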
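
For a finite state space, one pass of the procedure can be sketched in Python as follows. This is a minimal sketch, not the book's listing: the function name `bayes_filter_step`, the dictionary layout of the models, and the toy two-state model at the end are all assumptions made for illustration.

```python
def bayes_filter_step(bel, u, z, transition, sensor):
    """One Bayes filter step over a finite state space.

    bel:        dict state -> bel(x_{t-1})
    transition: dict (x_t, u_t, x_{t-1}) -> p(x_t | u_t, x_{t-1})
    sensor:     dict (z_t, x_t) -> p(z_t | x_t)
    """
    states = list(bel)
    # Prediction (control update): sum over the previous state.
    bel_bar = {
        x: sum(transition[(x, u, x_prev)] * bel[x_prev] for x_prev in states)
        for x in states
    }
    # Measurement update, still unnormalized.
    unnormalized = {x: sensor[(z, x)] * bel_bar[x] for x in states}
    # eta is the reciprocal of the total unnormalized mass.
    eta = 1.0 / sum(unnormalized.values())
    return {x: eta * p for x, p in unnormalized.items()}


# Toy two-state model (hypothetical numbers, for illustration only).
toy_transition = {
    (x, "stay", x_prev): 1.0 if x == x_prev else 0.0
    for x in "ab"
    for x_prev in "ab"
}
toy_sensor = {("see_a", "a"): 0.9, ("see_a", "b"): 0.3}
toy_bel = bayes_filter_step(
    {"a": 0.5, "b": 0.5}, "stay", "see_a", toy_transition, toy_sensor
)
print(toy_bel)
```

The sketch assumes the measurement has nonzero probability under the predicted belief; otherwise the normalizer is undefined and the division fails.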

3 Example

  • To make the formulas and the algorithm above concrete, consider an example: a mobile robot uses a sensor to estimate the state of a door

  • Assumptions:

    • The door is in one of two states $X_t$: $X_t = \text{is\_open}$ or $X_t = \text{is\_closed}$
    • The sensor reports one of two readings $Z_t$: $Z_t = \text{sense\_open}$ or $Z_t = \text{sense\_closed}$
    • The sensor is noisy, so its readings can be wrong; the noise is expressed as conditional probabilities:
      $$
      \begin{aligned}
      p(Z_t = \text{sense\_open} \mid X_t = \text{is\_open}) &= 0.6 \\
      p(Z_t = \text{sense\_closed} \mid X_t = \text{is\_open}) &= 0.4 \\
      p(Z_t = \text{sense\_open} \mid X_t = \text{is\_closed}) &= 0.2 \\
      p(Z_t = \text{sense\_closed} \mid X_t = \text{is\_closed}) &= 0.8
      \end{aligned}
      $$
  • The robot can change its environment through its actions (pushing the door open by hand)

    • The robot has two control actions $U_t$: $U_t = \text{push}$ or $U_t = \text{do\_nothing}$
    • If the door is closed, a push opens it with probability 0.8
    • The conditional probabilities after the robot pushes are:
      $$
      \begin{aligned}
      p(X_t = \text{is\_open} \mid U_t = \text{push}, X_{t-1} = \text{is\_open}) &= 1 \\
      p(X_t = \text{is\_closed} \mid U_t = \text{push}, X_{t-1} = \text{is\_open}) &= 0 \\
      p(X_t = \text{is\_open} \mid U_t = \text{push}, X_{t-1} = \text{is\_closed}) &= 0.8 \\
      p(X_t = \text{is\_closed} \mid U_t = \text{push}, X_{t-1} = \text{is\_closed}) &= 0.2
      \end{aligned}
      $$
    • The conditional probabilities when the robot does nothing are:
      $$
      \begin{aligned}
      p(X_t = \text{is\_open} \mid U_t = \text{do\_nothing}, X_{t-1} = \text{is\_open}) &= 1 \\
      p(X_t = \text{is\_closed} \mid U_t = \text{do\_nothing}, X_{t-1} = \text{is\_open}) &= 0 \\
      p(X_t = \text{is\_open} \mid U_t = \text{do\_nothing}, X_{t-1} = \text{is\_closed}) &= 0 \\
      p(X_t = \text{is\_closed} \mid U_t = \text{do\_nothing}, X_{t-1} = \text{is\_closed}) &= 1
      \end{aligned}
      $$
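
The sensor and transition models above are small enough to write down as lookup tables. The sketch below does that and checks that every conditional distribution sums to 1. The probabilities are the ones from the text; the names `SENSOR`, `TRANSITION`, and `is_normalized` and the nesting order of the dictionaries are assumptions of this sketch.

```python
# Sensor model p(z | x), keyed as SENSOR[x][z].
SENSOR = {
    "is_open": {"sense_open": 0.6, "sense_closed": 0.4},
    "is_closed": {"sense_open": 0.2, "sense_closed": 0.8},
}

# Transition model p(x_t | u_t, x_{t-1}), keyed as TRANSITION[u][x_prev][x].
TRANSITION = {
    "push": {
        "is_open": {"is_open": 1.0, "is_closed": 0.0},
        "is_closed": {"is_open": 0.8, "is_closed": 0.2},
    },
    "do_nothing": {
        "is_open": {"is_open": 1.0, "is_closed": 0.0},
        "is_closed": {"is_open": 0.0, "is_closed": 1.0},
    },
}


def is_normalized(dist, tol=1e-12):
    """True if the probabilities in dist sum to 1 within tol."""
    return abs(sum(dist.values()) - 1.0) <= tol


sensor_ok = all(is_normalized(row) for row in SENSOR.values())
transition_ok = all(
    is_normalized(row) for table in TRANSITION.values() for row in table.values()
)
print(sensor_ok, transition_ok)
```

Each row is a distribution over the outcome (the reading, or the next state) for one fixed condition, which is why the rows, not the columns, must sum to 1.
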
  • Computing the belief $bel(x_1)$ at time $t=1$:

    • Initially the robot knows nothing about the state of the door, so each state gets a belief of $50\%$:
      $$bel(X_0 = \text{is\_open}) = 0.5$$
      $$bel(X_0 = \text{is\_closed}) = 0.5$$
    • Line 3 of the algorithm (assume that at $t=1$ the robot takes no action, $U_1 = \text{do\_nothing}$):
      $$\bar{bel}(x_t) = \int p(x_t \mid x_{t-1}, u_t)\, bel(x_{t-1})\, dx_{t-1}$$
      Because the state space is discrete, the integral becomes a sum over the two values of $X_0$ ($\text{is\_open}$ and $\text{is\_closed}$):
      $$
      \begin{aligned}
      \bar{bel}(x_1) &= \int p(x_1 \mid x_0, u_1)\, bel(x_0)\, dx_0 \\
      &= \sum_{x_0} p(x_1 \mid x_0, u_1)\, bel(x_0) \\
      &= p(x_1 \mid U_1 = \text{do\_nothing}, X_0 = \text{is\_open})\, bel(X_0 = \text{is\_open}) \\
      &\quad + p(x_1 \mid U_1 = \text{do\_nothing}, X_0 = \text{is\_closed})\, bel(X_0 = \text{is\_closed})
      \end{aligned}
      $$
      Evaluating this for each value of $X_1$:
      $$
      \begin{aligned}
      \bar{bel}(X_1 = \text{is\_open}) &= p(X_1 = \text{is\_open} \mid U_1 = \text{do\_nothing}, X_0 = \text{is\_open})\, bel(X_0 = \text{is\_open}) \\
      &\quad + p(X_1 = \text{is\_open} \mid U_1 = \text{do\_nothing}, X_0 = \text{is\_closed})\, bel(X_0 = \text{is\_closed}) \\
      &= 1 \cdot 0.5 + 0 \cdot 0.5 \\
      &= 0.5
      \end{aligned}
      $$
      $$
      \begin{aligned}
      \bar{bel}(X_1 = \text{is\_closed}) &= p(X_1 = \text{is\_closed} \mid U_1 = \text{do\_nothing}, X_0 = \text{is\_open})\, bel(X_0 = \text{is\_open}) \\
      &\quad + p(X_1 = \text{is\_closed} \mid U_1 = \text{do\_nothing}, X_0 = \text{is\_closed})\, bel(X_0 = \text{is\_closed}) \\
      &= 0 \cdot 0.5 + 1 \cdot 0.5 \\
      &= 0.5
      \end{aligned}
      $$

    • Line 4 of the algorithm (assume that at $t=1$ the sensor reports an open door, $Z_1 = \text{sense\_open}$):
      $$bel(x_t) = \eta\, p(z_t \mid x_t)\, \bar{bel}(x_t) \quad (\eta \text{: normalizer})$$
      $$bel(x_1) = \eta\, p(Z_1 = \text{sense\_open} \mid x_1)\, \bar{bel}(x_1)$$
      $$
      \begin{aligned}
      bel(X_1 = \text{is\_open}) &= \eta\, p(Z_1 = \text{sense\_open} \mid X_1 = \text{is\_open})\, \bar{bel}(X_1 = \text{is\_open}) \\
      &= \eta \cdot 0.6 \cdot 0.5 \\
      &= 0.3\eta
      \end{aligned}
      $$
      $$
      \begin{aligned}
      bel(X_1 = \text{is\_closed}) &= \eta\, p(Z_1 = \text{sense\_open} \mid X_1 = \text{is\_closed})\, \bar{bel}(X_1 = \text{is\_closed}) \\
      &= \eta \cdot 0.2 \cdot 0.5 \\
      &= 0.1\eta
      \end{aligned}
      $$
      Normalizing, $\eta = (0.3 + 0.1)^{-1} = 2.5$, so:
      $$bel(X_1 = \text{is\_open}) = 0.3\eta = 0.75$$
      $$bel(X_1 = \text{is\_closed}) = 0.1\eta = 0.25$$
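
The $t=1$ arithmetic can be reproduced in a few lines of Python. This sketch spells out the two steps for this one time step only; the variable names (`bel0`, `p_stay`, `p_sense_open`, and so on) are chosen here for illustration, and the numbers are those of the example.

```python
# Values from the example: uniform prior, U_1 = do_nothing, Z_1 = sense_open.
bel0 = {"is_open": 0.5, "is_closed": 0.5}

# p(X_1 | U_1 = do_nothing, X_0), keyed by (x_1, x_0): the door keeps its state.
p_stay = {
    ("is_open", "is_open"): 1.0,
    ("is_closed", "is_open"): 0.0,
    ("is_open", "is_closed"): 0.0,
    ("is_closed", "is_closed"): 1.0,
}
# p(Z_1 = sense_open | X_1)
p_sense_open = {"is_open": 0.6, "is_closed": 0.2}

# Line 3: prediction, a sum over the two values of X_0.
bel_bar1 = {
    x1: sum(p_stay[(x1, x0)] * bel0[x0] for x0 in bel0) for x1 in bel0
}
# Line 4: measurement update; eta is computed once both products are known.
unnormalized = {x1: p_sense_open[x1] * bel_bar1[x1] for x1 in bel_bar1}
eta = 1.0 / sum(unnormalized.values())
bel1 = {x1: eta * p for x1, p in unnormalized.items()}

print(bel_bar1, round(eta, 6), {k: round(v, 6) for k, v in bel1.items()})
```
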
  • In the same way, computing the belief $bel(x_2)$ at time $t=2$:
    • Line 3 of the algorithm (assume that at $t=2$ the robot pushes, $U_2 = \text{push}$):
      $$
      \begin{aligned}
      \bar{bel}(X_2 = \text{is\_open}) &= p(X_2 = \text{is\_open} \mid U_2 = \text{push}, X_1 = \text{is\_open})\, bel(X_1 = \text{is\_open}) \\
      &\quad + p(X_2 = \text{is\_open} \mid U_2 = \text{push}, X_1 = \text{is\_closed})\, bel(X_1 = \text{is\_closed}) \\
      &= 1 \cdot 0.75 + 0.8 \cdot 0.25 \\
      &= 0.95
      \end{aligned}
      $$
      $$
      \begin{aligned}
      \bar{bel}(X_2 = \text{is\_closed}) &= p(X_2 = \text{is\_closed} \mid U_2 = \text{push}, X_1 = \text{is\_open})\, bel(X_1 = \text{is\_open}) \\
      &\quad + p(X_2 = \text{is\_closed} \mid U_2 = \text{push}, X_1 = \text{is\_closed})\, bel(X_1 = \text{is\_closed}) \\
      &= 0 \cdot 0.75 + 0.2 \cdot 0.25 \\
      &= 0.05
      \end{aligned}
      $$
    • Line 4 of the algorithm (assume that at $t=2$ the sensor reports an open door, $Z_2 = \text{sense\_open}$):
      $$
      \begin{aligned}
      bel(X_2 = \text{is\_open}) &= \eta\, p(Z_2 = \text{sense\_open} \mid X_2 = \text{is\_open})\, \bar{bel}(X_2 = \text{is\_open}) \\
      &= \eta \cdot 0.6 \cdot 0.95 \\
      &= 0.57\eta
      \end{aligned}
      $$
      $$
      \begin{aligned}
      bel(X_2 = \text{is\_closed}) &= \eta\, p(Z_2 = \text{sense\_open} \mid X_2 = \text{is\_closed})\, \bar{bel}(X_2 = \text{is\_closed}) \\
      &= \eta \cdot 0.2 \cdot 0.05 \\
      &= 0.01\eta
      \end{aligned}
      $$
      Normalizing, $\eta = (0.57 + 0.01)^{-1}$, so:
      $$bel(X_2 = \text{is\_open}) \approx 0.983$$
      $$bel(X_2 = \text{is\_closed}) \approx 0.017$$
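
The rounded values at $t=2$ can be cross-checked with exact rational arithmetic. The sketch below uses Python's standard `fractions` module and starts from $bel(x_1) = (0.75, 0.25)$; the variable names are illustrative choices, not part of the original example.

```python
from fractions import Fraction as F

# bel(x_1) from the previous step, written exactly.
bel1 = {"is_open": F(3, 4), "is_closed": F(1, 4)}

# p(X_2 | U_2 = push, X_1), keyed by (x_2, x_1).
p_push = {
    ("is_open", "is_open"): F(1),
    ("is_closed", "is_open"): F(0),
    ("is_open", "is_closed"): F(4, 5),
    ("is_closed", "is_closed"): F(1, 5),
}
# p(Z_2 = sense_open | X_2)
p_sense_open = {"is_open": F(3, 5), "is_closed": F(1, 5)}

# Line 3: prediction.
bel_bar2 = {x2: sum(p_push[(x2, x1)] * bel1[x1] for x1 in bel1) for x2 in bel1}
# Line 4: measurement update with exact normalization.
unnormalized = {x2: p_sense_open[x2] * bel_bar2[x2] for x2 in bel_bar2}
eta = 1 / sum(unnormalized.values())
bel2 = {x2: eta * p for x2, p in unnormalized.items()}

print(bel_bar2["is_open"], eta, bel2["is_open"], float(bel2["is_open"]))
```

In exact form the result is $bel(X_2 = \text{is\_open}) = 57/58$ and $bel(X_2 = \text{is\_closed}) = 1/58$, which round to the 0.983 and 0.017 given above.
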
  • The beliefs $bel(x_t)$ at $t = 3, 4, \cdots$ are computed in the same way
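
Continuing to later time steps is just a loop over (control, measurement) pairs. The sketch below writes the door example in index form, with state 0 for open and 1 for closed. The inputs for $t=1$ and $t=2$ are the ones used above; the inputs for $t=3$ and $t=4$ are made up for illustration, as are the names `T`, `L`, `step`, and `history`.

```python
STATES = ("is_open", "is_closed")  # index 0 = open, 1 = closed

# T[u][i][j] = p(X_t = STATES[i] | U_t = u, X_{t-1} = STATES[j])
T = {
    "push": [[1.0, 0.8], [0.0, 0.2]],
    "do_nothing": [[1.0, 0.0], [0.0, 1.0]],
}
# L[z][i] = p(Z_t = z | X_t = STATES[i])
L = {"sense_open": [0.6, 0.2], "sense_closed": [0.4, 0.8]}


def step(bel, u, z):
    """Prediction followed by measurement update for the door example."""
    bel_bar = [sum(T[u][i][j] * bel[j] for j in range(2)) for i in range(2)]
    unnorm = [L[z][i] * bel_bar[i] for i in range(2)]
    eta = 1.0 / sum(unnorm)
    return [eta * p for p in unnorm]


# t = 1, 2 are the inputs used in the text; t = 3, 4 are hypothetical.
inputs = [
    ("do_nothing", "sense_open"),
    ("push", "sense_open"),
    ("do_nothing", "sense_closed"),  # hypothetical
    ("do_nothing", "sense_open"),    # hypothetical
]

bel = [0.5, 0.5]  # bel(x_0): uniform prior
history = []
for t, (u, z) in enumerate(inputs, start=1):
    bel = step(bel, u, z)
    history.append(bel)
    print(t, [round(p, 3) for p in bel])
```

Only the previous belief is carried from one iteration to the next, which is the recursive structure the derivation in section 1 establishes.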

4 Analysis

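  • Line 3 of the algorithm is called the control update, or prediction, because it computes the belief from the control action the robot takes
  • Line 4 of the algorithm is called the measurement update, because it incorporates the sensor measurement into the belief
  • $\eta$ serves only as a normalizer and is computed last, once the unnormalized values for all states are known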