Statistical Inference (1): Hypothesis Testing
1. Binary Bayesian hypothesis testing
1.0 Problem Setting
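- Hypothesis: $H_0$, $H_1$
- Hypothesis space: $\mathcal{H} = \{H_0, H_1\}$
- Bayesian approach: model the valid hypothesis as a random variable $H$
- Prior: $P_0 = p_H(H_0)$, $P_1 = p_H(H_1) = 1 - P_0$
- Observation: $y$
- Observation space: $\mathcal{Y}$
- Observation model: $p_{y|H}(\cdot \mid H_0)$, $p_{y|H}(\cdot \mid H_1)$
- Decision rule: $f: \mathcal{Y} \to \mathcal{H}$
- Cost function: $C: \mathcal{H} \times \mathcal{H} \to \mathbb{R}$
- Let $C_{ij} = C(H_j, H_i)$ be the cost of deciding $H_i$ when $H_j$ is the valid hypothesis
- $C$ is valid if $C_{jj} < C_{ij}$ for $i \ne j$
- Optimum decision rule: $\hat{H}(\cdot) = \arg\min_{f(\cdot)} \mathbb{E}[C(H, f(y))]$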
1.1 Binary Bayesian hypothesis testing
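Theorem: The optimal Bayes' decision takes the form
$$L(y) \triangleq \frac{p_{y|H}(y \mid H_1)}{p_{y|H}(y \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{P_0 (C_{10} - C_{00})}{P_1 (C_{01} - C_{11})} \triangleq \eta$$
where $P_m$ is the prior probability of $H_m$ and $C_{ij}$ is the cost of deciding $H_i$ when $H_j$ is valid.
Proof:
$$\begin{aligned} \varphi(f) &= \mathbb{E}[C(H, f(y))] \\ &= \int \mathbb{E}[C(H, f(y)) \mid y = y^*] \, p_y(y^*) \, dy^* \end{aligned}$$
Since $p_y(y^*) \ge 0$, it suffices to minimize the conditional expected cost for each $y^*$ separately. Given $y^*$:
- if $f(y^*) = H_0$, $\mathbb{E}[C(H, H_0) \mid y = y^*] = C_{00} \, p_{H|y}(H_0 \mid y^*) + C_{01} \, p_{H|y}(H_1 \mid y^*)$
- if $f(y^*) = H_1$, $\mathbb{E}[C(H, H_1) \mid y = y^*] = C_{10} \, p_{H|y}(H_0 \mid y^*) + C_{11} \, p_{H|y}(H_1 \mid y^*)$
So the rule decides $H_1$ exactly when the second expression is no larger than the first:
$$\frac{p_{H|y}(H_1 \mid y^*)}{p_{H|y}(H_0 \mid y^*)} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{C_{10} - C_{00}}{C_{01} - C_{11}}$$
Substituting $p_{H|y}(H_m \mid y^*) \propto p_{y|H}(y^* \mid H_m) P_m$ (Bayes' rule) gives the stated form.
Remark: in the proof, note that the Bayesian test is deterministic, so for a fixed $y$ the probability that $f(y) = H_1$ is either 0 or 1. When taking the expectation of the cost function, treat $H$ as a random variable and $f(y)$ as a fixed value, and handle the two possible values of $f(y)$ case by case.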
Special cases
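- Maximum a posteriori (MAP): $C_{00} = C_{11} = 0$, $C_{01} = C_{10} = 1$, so that $\hat{H}(y) = \arg\max_{H \in \{H_0, H_1\}} p_{H|y}(H \mid y)$
- Maximum likelihood (ML): MAP costs with equal priors $P_0 = P_1 = 1/2$, so that $\hat{H}(y) = \arg\max_{H \in \{H_0, H_1\}} p_{y|H}(y \mid H)$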
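The threshold formula and the two special cases can be checked numerically. The sketch below is only an illustration under assumptions that are not in the notes: a scalar observation with $H_0: y \sim N(0, 1)$ and $H_1: y \sim N(1, 1)$, and hypothetical helper names (`bayes_threshold`, `likelihood_ratio`, `decide`). Costs follow the convention that `c_ij` is the cost of deciding $H_i$ when $H_j$ is valid.

```python
import math

def bayes_threshold(p0, p1, c00, c01, c10, c11):
    """LRT threshold eta = P0 (C10 - C00) / (P1 (C01 - C11))."""
    return p0 * (c10 - c00) / (p1 * (c01 - c11))

def gaussian_pdf(y, mean, sigma):
    """Density of N(mean, sigma^2) evaluated at y."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def likelihood_ratio(y, mu0=0.0, mu1=1.0, sigma=1.0):
    """L(y) = p(y | H1) / p(y | H0) for the assumed Gaussian model."""
    return gaussian_pdf(y, mu1, sigma) / gaussian_pdf(y, mu0, sigma)

def decide(y, eta):
    """Return 1 (decide H1) if L(y) >= eta, otherwise 0 (decide H0)."""
    return 1 if likelihood_ratio(y) >= eta else 0

# MAP: 0-1 costs, unequal priors (P0 = 0.8, P1 = 0.2) -> eta = P0 / P1
eta_map = bayes_threshold(0.8, 0.2, 0, 1, 1, 0)
# ML: 0-1 costs, equal priors -> eta = 1
eta_ml = bayes_threshold(0.5, 0.5, 0, 1, 1, 0)

for y in (0.4, 1.0, 2.0):
    print(y, decide(y, eta_map), decide(y, eta_ml))
```

With this model $L(y) = e^{y - 1/2}$, so a larger prior on $H_0$ simply moves the decision boundary on $y$ to the right.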
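1.2 Likelihood Ratio Test
Generally, a likelihood ratio test (LRT) takes the form $L(y) \underset{H_0}{\overset{H_1}{\gtrless}} \eta$, where $L(y) = p_{y|H}(y \mid H_1) / p_{y|H}(y \mid H_0)$
- The Bayesian formulation gives a method of calculating $\eta$
- $L(y)$ is a sufficient statistic for the decision problem
- Any invertible function of $L(y)$ is also a sufficient statistic
Sufficient statistic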
1.3 ROC
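- Detection probability: $P_D = P(\hat{H} = H_1 \mid H = H_1)$
- False-alarm probability: $P_F = P(\hat{H} = H_1 \mid H = H_0)$
Properties (important!)
- The ROC curve of the LRT ($P_D$ plotted against $P_F$ as the threshold $\eta$ varies) is monotonically non-decreasing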
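A short numerical sketch of the ROC, assuming (this model is not in the notes) $H_0: y \sim N(0, 1)$ versus $H_1: y \sim N(\mu, 1)$ with $\mu > 0$. In that case the LRT is equivalent to comparing $y$ with a threshold `gamma`, so sweeping `gamma` traces the curve. The helper names are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def roc_point(gamma, mu=1.0, sigma=1.0):
    """Return (P_F, P_D) for the test 'decide H1 iff y >= gamma'."""
    p_f = q_function(gamma / sigma)          # y ~ N(0, sigma^2) under H0
    p_d = q_function((gamma - mu) / sigma)   # y ~ N(mu, sigma^2) under H1
    return p_f, p_d

# Lowering the threshold from 4.0 to -4.0 increases P_F step by step.
gammas = [4.0 - 0.5 * k for k in range(17)]
roc = [roc_point(g) for g in gammas]

for (p_f, p_d) in roc:
    print(f"P_F = {p_f:.4f}  P_D = {p_d:.4f}")
```

Along the sweep both coordinates grow together, which is the monotonicity property stated above.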
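2. Non-Bayesian hypothesis testing
- The non-Bayesian approach requires neither prior probabilities nor a cost function
Neyman-Pearson criterion
$$\max_{\hat{H}(\cdot)} P_D \quad \text{s.t.} \quad P_F \le \alpha$$
Theorem (Neyman-Pearson Lemma): the optimal solution under the NP criterion is given by an LRT, where the threshold $\eta$ is determined by
$$P_F = P(L(y) \ge \eta \mid H = H_0) = \alpha$$
Proof (intuition): for the same $P_F$, the LRT has the largest $P_D$. In the region where the LRT decides $H_1$, the ratio $p_{y|H}(y \mid H_1) / p_{y|H}(y \mid H_0)$ is as large as possible, so for a fixed $P_F$ the value of $P_D$ is maximized.
Remark: the NP-optimal solution is an LRT because
- for the same $P_F$, the LRT has the largest $P_D$
- across LRTs with different $\eta$, a larger $P_F$ comes with a larger $P_D$, i.e. the ROC curve is monotonically non-decreasing, so the constraint should be met with equality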
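As a worked instance of the Neyman-Pearson recipe, the sketch below picks the threshold that meets a false-alarm budget `alpha` exactly. It assumes (not from the notes) the continuous model $H_0: y \sim N(0, \sigma^2)$ versus $H_1: y \sim N(\mu, \sigma^2)$ with $\mu > 0$, where the LRT reduces to "decide $H_1$ iff $y \ge \gamma$". The function name `np_test` is hypothetical; `statistics.NormalDist` is from the Python standard library (3.8+).

```python
from statistics import NormalDist

def np_test(alpha, mu=1.0, sigma=1.0):
    """Return (gamma, P_F, P_D) for the NP test with false-alarm level alpha.

    gamma solves P(y >= gamma | H0) = alpha, i.e. gamma = sigma * Phi^{-1}(1 - alpha).
    """
    std = NormalDist()  # standard normal N(0, 1)
    gamma = sigma * std.inv_cdf(1.0 - alpha)
    p_f = 1.0 - std.cdf(gamma / sigma)
    p_d = 1.0 - std.cdf((gamma - mu) / sigma)
    return gamma, p_f, p_d

gamma, p_f, p_d = np_test(0.05)
print(f"gamma = {gamma:.4f}, P_F = {p_f:.4f}, P_D = {p_d:.4f}")
```

Relaxing `alpha` lowers `gamma` and raises the detection probability, which is why the constraint is met with equality in the continuous case.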
3. Randomized test
3.1 Decision rule
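- Two deterministic decision rules: $\hat{H}'(\cdot)$ and $\hat{H}''(\cdot)$, with operating points $(P_F', P_D')$ and $(P_F'', P_D'')$
- Randomized decision rule by time-sharing: use $\hat{H}'(\cdot)$ with probability $p$ and $\hat{H}''(\cdot)$ with probability $1 - p$
  - Detection prob: $P_D = p P_D' + (1 - p) P_D''$
  - False-alarm prob: $P_F = p P_F' + (1 - p) P_F''$
- A randomized decision rule is fully described by $p_{\hat{H}|y}(H_m \mid y)$ for $m = 0, 1$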
3.2 Proposition
- Bayesian case: a randomized test cannot achieve a lower Bayes' risk than the optimum LRT
  Proof: the risk for each $y$ is linear in $q_0(y) \triangleq p_{\hat{H}|y}(H_0 \mid y)$, so the minimum is achieved at 0 or 1, which degenerates to a deterministic decision. With $C_{ij}$ the cost of deciding $H_i$ when $H_j$ is valid,
  $$\begin{aligned} \varphi(y) &= \mathbb{E}[C(H, \hat{H}) \mid y] \\ &= \sum_{j=0}^{1} \left[ C_{0j} \, q_0(y) + C_{1j} \, (1 - q_0(y)) \right] p_{H|y}(H_j \mid y) \end{aligned}$$
- Neyman-Pearson case:
  - continuous-valued: for a given $P_F$ constraint, a randomized test cannot achieve a larger $P_D$ than the optimum LRT
  - discrete-valued: for a given $P_F$ constraint, a randomized test can achieve a larger $P_D$ than the optimum LRT. Furthermore, the optimum randomized test corresponds to simple time-sharing between the two neighbouring LRTs
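The discrete-valued claim can be illustrated with a toy example. The pmfs below are made up for illustration (they are not from the notes), the observation takes values in {0, 1, 2}, and the helper names are hypothetical. The sketch lists the deterministic LRT operating points and then time-shares between the two neighbouring ones so that the false-alarm probability equals `alpha` exactly.

```python
def lrt_operating_points(p0, p1):
    """(P_F, P_D) of the deterministic LRTs, from 'never decide H1' to 'always'.

    Observation values are added to the H1 region in order of decreasing
    likelihood ratio p1[y] / p0[y]. Assumes p0[y] > 0 for every y.
    """
    order = sorted(range(len(p0)), key=lambda y: p1[y] / p0[y], reverse=True)
    points = [(0.0, 0.0)]
    p_f = p_d = 0.0
    for y in order:
        p_f += p0[y]
        p_d += p1[y]
        points.append((p_f, p_d))
    return points

def best_deterministic(points, alpha):
    """Largest P_D among the listed deterministic LRTs with P_F <= alpha."""
    return max(p_d for p_f, p_d in points if p_f <= alpha + 1e-12)

def best_randomized(points, alpha):
    """P_D obtained by time-sharing two neighbouring LRTs so that P_F = alpha."""
    for (f_lo, d_lo), (f_hi, d_hi) in zip(points, points[1:]):
        if f_lo <= alpha <= f_hi and f_hi > f_lo:
            lam = (alpha - f_lo) / (f_hi - f_lo)  # probability of the upper rule
            return (1.0 - lam) * d_lo + lam * d_hi
    return points[-1][1]

p0 = [0.5, 0.3, 0.2]  # assumed pmf of y under H0
p1 = [0.1, 0.3, 0.6]  # assumed pmf of y under H1
points = lrt_operating_points(p0, p1)

alpha = 0.35
print(best_deterministic(points, alpha), best_randomized(points, alpha))
```

Here the deterministic LRTs jump from $P_F = 0.2$ to $P_F = 0.5$, so the budget $\alpha = 0.35$ cannot be used up without randomizing; time-sharing fills the gap along the straight segment between the two points.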
3.3 Efficient frontier
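The efficient frontier is the upper boundary of the region of achievable operating points $(P_F, P_D)$
- continuous-valued: the ROC of the LRT
- discrete-valued: the LRT points and the straight line segments connecting them
Facts
- The efficient frontier, viewed as $P_D$ as a function of $P_F$, is concave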
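4. Minimax hypothesis testing
Prior: unknown; cost function: known
4.1 Decision rule
- Minimax approach: $\hat{H}(\cdot) = \arg\min_{f(\cdot)} \max_{p \in [0, 1]} \varphi(f, p)$, where $p = P(H = H_1)$ is the unknown prior and $\varphi(f, p)$ is the expected cost of rule $f$ under that prior
- Optimal decision rule: $\hat{H}(\cdot) = \hat{H}_{p_*}(\cdot)$, the Bayes rule designed for the least favorable prior $p_* = \arg\max_{p \in [0, 1]} \varphi(\hat{H}_p, p)$
To prove that this decision is optimal, first introduce the mismatched Bayes decision: the rule is designed for a prior $q$ while the true prior is $p$,
$$\hat{H}_q(y) = \begin{cases} H_1, & L(y) \ge \dfrac{(1 - q)(C_{10} - C_{00})}{q \, (C_{01} - C_{11})} \\ H_0, & \text{otherwise} \end{cases}$$
Its cost function is given below, which shows that $\varphi(\hat{H}_q, p)$ is linear in the probability $p$:
$$\varphi(\hat{H}_q, p) = (1 - p) \left[ C_{00} (1 - P_F(q)) + C_{10} P_F(q) \right] + p \left[ C_{01} (1 - P_D(q)) + C_{11} P_D(q) \right]$$
where $P_F(q)$ and $P_D(q)$ are the false-alarm and detection probabilities of $\hat{H}_q$.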
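Lemma (max-min inequality): $\max_x \min_y g(x, y) \le \min_y \max_x g(x, y)$
Theorem: $\min_{f} \max_{p \in [0, 1]} \varphi(f, p) = \max_{p \in [0, 1]} \min_{f} \varphi(f, p)$, where $\varphi(f, p)$ is the expected cost of rule $f$ when the prior is $p = P(H = H_1)$
Proof of Lemma: Let $h(x) = \min_y g(x, y)$. Then $h(x) \le g(x, y)$ for all $x$ and $y$, so $\max_x h(x) \le \max_x g(x, y)$ for every $y$. Taking the minimum over $y$ on the right-hand side gives the inequality.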
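A quick numerical sanity check of the max-min inequality on a small table of values. The matrix `g` is arbitrary and chosen only for illustration; rows play the role of the maximizing variable $x$ and columns the minimizing variable $y$.

```python
# g[x][y]: rows are indexed by x (maximizer), columns by y (minimizer)
g = [
    [3, 1, 4],
    [2, 5, 0],
    [6, 2, 3],
]

# max over x of (min over y): the worst case of each row, then the best row
max_min = max(min(row) for row in g)

# min over y of (max over x): the best case of each column, then the worst column
min_max = min(max(g[x][y] for x in range(len(g))) for y in range(len(g[0])))

print(max_min, min_max)
```

For this particular table the inequality is strict, which shows that equality (as claimed for the minimax test) needs extra structure and is not automatic.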
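Proof of Thm: Let $\hat{H}_q$ denote the Bayes rule designed for prior $q$, and $\varphi(f, p)$ the expected cost of rule $f$ when the true prior is $p = P(H = H_1)$. First pick any $p_1, p_2 \in [0, 1]$; then
$$\varphi(\hat{H}_{p_1}, p_1) = \min_f \varphi(f, p_1) \le \max_p \min_f \varphi(f, p) \le \min_f \max_p \varphi(f, p) \le \max_p \varphi(\hat{H}_{p_2}, p)$$
Since this holds for arbitrary $p_1$ and $p_2$, we may take $p_1 = p_2 = p_*$. To prove the theorem it then suffices to show that
$$\varphi(\hat{H}_{p_*}, p_*) = \max_p \varphi(\hat{H}_{p_*}, p)$$
As noted earlier, $\varphi(\hat{H}_q, p)$ is linear in $p$, so to prove the equality above:
- If $p_* \in (0, 1)$, it suffices that $\partial \varphi(\hat{H}_{p_*}, p) / \partial p = 0$; the cost is then constant in $p$ and the equality holds trivially
- If $p_* = 1$, it suffices that $\partial \varphi(\hat{H}_{p_*}, p) / \partial p \ge 0$, so that the maximum over $p$ is attained at $p = 1 = p_*$; the case $p_* = 0$ is analogous with a non-positive slope
By the lemma below, the optimal decision is the Bayes decision $\hat{H}_{p_*}$, where $p_*$ satisfies
$$p_* = \arg\max_{p \in [0, 1]} \varphi(\hat{H}_p, p)$$
Lemma: $\left. \dfrac{d \varphi(\hat{H}_p, p)}{dp} \right|_{p = q} = \left. \dfrac{\partial \varphi(\hat{H}_q, p)}{\partial p} \right|_{p = q}$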
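The least favorable prior and the flat mismatched-risk line can be checked numerically. The sketch below assumes (not from the notes) 0-1 costs and the model $H_0: y \sim N(0, 1)$ versus $H_1: y \sim N(\mu, 1)$, for which the Bayes rule designed for prior $q = P(H = H_1)$ is "decide $H_1$ iff $y \ge \mu/2 + \ln((1 - q)/q)/\mu$". Function names are hypothetical and the maximization is a plain grid search.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def operating_point(q, mu=1.0):
    """(P_F, P_D) of the Bayes LRT designed for prior q = P(H1), 0-1 costs."""
    gamma = mu / 2.0 + math.log((1.0 - q) / q) / mu
    return q_function(gamma), q_function(gamma - mu)

def mismatched_risk(q, p, mu=1.0):
    """Risk of the rule designed for prior q when the true prior is p.

    With 0-1 costs this is (1 - p) * P_F(q) + p * (1 - P_D(q)), linear in p.
    """
    p_f, p_d = operating_point(q, mu)
    return (1.0 - p) * p_f + p * (1.0 - p_d)

def bayes_risk(p, mu=1.0):
    """Risk of the matched Bayes rule (designed for the true prior p)."""
    return mismatched_risk(p, p, mu)

# Least favorable prior: maximize the Bayes risk over a grid on (0, 1).
grid = [k / 1000 for k in range(1, 1000)]
p_star = max(grid, key=bayes_risk)

print(p_star, bayes_risk(p_star))
print(mismatched_risk(p_star, 0.1), mismatched_risk(p_star, 0.9))
```

In this symmetric example the grid search lands on $p_* = 0.5$, and the rule designed for $p_*$ has the same risk whatever the true prior is, which is the zero-slope condition used in the proof for an interior $p_*$.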