A Summary of Learning Based Digital Matting

Title: Learning Based Digital Matting

Paper link: https://pan.baidu.com/s/1kXgQq39 (password: y5qq)

MATLAB code link: http://download.****.net/download/on_theway10/10112302

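In general, image matting consists of two tasks: estimating the alpha matte, and computing the foreground F and background B. Let the input image be I and the set of its pixels be $\Omega = \{1, \dots, n\}$, where n is the number of pixels. Each pixel $i \in \Omega$ has a feature vector $\mathbf{x}_i \in \mathbb{R}^d$ (d = 1 for a gray image, d = 3 for a BGR image). The pixels covered by the scribble or trimap form the labeled set $\Omega_l \subset \Omega$; the remaining pixels form the unlabeled set $\Omega_u$. The pixels in $\Omega_l$ are further divided into foreground (alpha value 1) and background (alpha value 0). In the paper's example figure, panel (c) marks foreground in blue and background in red.

The authors treat alpha matte estimation as a learning problem, split into scribble-based local learning and trimap-based global learning. A brief note on the difference between the two inputs: a scribble provides only rough foreground and background cues (e.g. panel (c)), whereas a trimap provides comparatively precise ones. A trimap therefore carries more useful information, and the alpha matte produced by the matting algorithm is more accurate; on the other hand, producing a trimap costs considerably more time and effort than drawing a scribble.

The focus here is the local learning strategy, which the authors build on semi-supervised learning. Semi-supervised methods generally exploit the manifold structure of the data. In local learning, that manifold structure degenerates into the image intensity surface formed by the image pixels, as shown in panel (f): the x-axis is image.cols, the y-axis is image.rows, and the z-axis is the pixel intensity. This differs from the way the neighbors of a sample are computed in general semi-supervised learning.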

1. Alpha Matte Estimation

First, the estimation of the alpha matte. The starting point is the compositing equation of the matting problem:

$$I_i = \alpha_i F_i + (1 - \alpha_i) B_i \tag{1}$$

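There is little to add here, since this equation already appears in earlier matting papers. For any pixel $i \in \Omega$, assume that $\alpha_i$ can be expressed as a linear combination of the alpha values of the pixels in its neighborhood $N_i = \{\tau_1, \dots, \tau_m\}$ (whether this assumption is reasonable is discussed further below). The neighborhood size is a free parameter; the paper uses 7×7. Let $\boldsymbol{\alpha}_{N_i} = [\alpha_{\tau_1}, \dots, \alpha_{\tau_m}]^T$ be the vector of alpha values in the neighborhood and $\mathbf{f}_i = [f_{i\tau_1}, \dots, f_{i\tau_m}]^T$ the combination coefficients. Then $\alpha_i$ can be written as:

$$\alpha_i = \mathbf{f}_i^T \boldsymbol{\alpha}_{N_i} \tag{2}$$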

Rewriting this in terms of the alpha values of all pixels gives

$$\alpha_i = \boldsymbol{\xi}_i^T \boldsymbol{\alpha} \tag{3}$$

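where $\boldsymbol{\alpha} = [\alpha_1, \dots, \alpha_n]^T$ and $\boldsymbol{\xi}_i$ is an n-dimensional vector. The j-th entry of $\boldsymbol{\xi}_i$ is zero if pixel j is not in the neighborhood of pixel i; otherwise it takes the corresponding coefficient from (2). Stacking (3) over all pixels gives

$$\boldsymbol{\alpha} = F^T \boldsymbol{\alpha} \tag{4}$$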

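where F is an n×n matrix whose i-th column is $\boldsymbol{\xi}_i$ (not to be confused with the foreground color F in (1)). If F is known, $\boldsymbol{\alpha}$ can be obtained from:

$$\boldsymbol{\alpha} = \arg\min_{\boldsymbol{\alpha} \in \mathbb{R}^n} \|\boldsymbol{\alpha} - F^T \boldsymbol{\alpha}\|^2 + c\,\|\boldsymbol{\alpha}_l - \boldsymbol{\alpha}_l^*\|^2 \tag{5}$$

where $\boldsymbol{\alpha}_l$ collects the alpha values of the labeled pixels, $\boldsymbol{\alpha}_l^*$ holds their user-specified values, and c weights the label term.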

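This objective is the classic semi-supervised learning framework: the first term encodes the semi-supervised information shared by all pixels, and the second term measures the supervised error on the labeled pixels. A simple rearrangement of (5) gives:

$$\boldsymbol{\alpha} = \arg\min_{\boldsymbol{\alpha} \in \mathbb{R}^n} \boldsymbol{\alpha}^T (I_n - F)(I_n - F)^T \boldsymbol{\alpha} + (\boldsymbol{\alpha} - \boldsymbol{\alpha}^*)^T C (\boldsymbol{\alpha} - \boldsymbol{\alpha}^*) \tag{6}$$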

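Here $I_n$ is the n×n identity matrix, C is the n×n diagonal matrix whose diagonal entry is c for labeled pixels and 0 otherwise, and $\boldsymbol{\alpha}^*$ is the vector of specified labels extended to n dimensions (zero at unlabeled pixels). Differentiating with respect to $\boldsymbol{\alpha}$ and setting the derivative to zero gives

$$\boldsymbol{\alpha} = \left((I_n - F)(I_n - F)^T + C\right)^{-1} C \boldsymbol{\alpha}^* \tag{7}$$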
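As a minimal sketch of the closed-form solve $\boldsymbol{\alpha} = ((I_n - F)(I_n - F)^T + C)^{-1} C \boldsymbol{\alpha}^*$, the snippet below uses a toy 1-D "image" of 7 pixels. The coefficient matrix F here is a stand-in built from uniform neighbor averaging, and `c=800.0` is an arbitrary illustrative weight; both are assumptions for illustration, not the paper's learned coefficients or settings. The helper name `solve_alpha` is hypothetical.

```python
import numpy as np

def solve_alpha(F, alpha_star, labeled, c=800.0):
    """Solve alpha = ((I - F)(I - F)^T + C)^{-1} C alpha*.

    F          : (n, n) matrix whose i-th column predicts alpha_i from its neighbours
    alpha_star : (n,) user labels, zero at unlabeled pixels
    labeled    : (n,) boolean mask of labeled pixels
    c          : weight of the label term (illustrative value, an assumption)
    """
    n = F.shape[0]
    I = np.eye(n)
    L = (I - F) @ (I - F).T
    C = np.diag(np.where(labeled, c, 0.0))
    return np.linalg.solve(L + C, C @ alpha_star)

# Toy 1-D image. Uniform averaging over the two adjacent pixels is an
# assumed stand-in for the learned coefficients.
n = 7
F = np.zeros((n, n))
for i in range(n):
    nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
    F[nbrs, i] = 1.0 / len(nbrs)

labeled = np.zeros(n, dtype=bool)
labeled[[0, n - 1]] = True      # scribbles on the two end pixels
alpha_star = np.zeros(n)
alpha_star[n - 1] = 1.0         # last pixel foreground, first pixel background

alpha = solve_alpha(F, alpha_star, labeled)
print(np.round(alpha, 3))
```

The labeled end pixels stay close to their labels, and the unlabeled pixels in between are filled in by the smoothness term. Using `np.linalg.solve` rather than forming the inverse explicitly is the usual numerically safer choice.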

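To obtain alpha we need F. To compute F, the authors assume that the alpha value of each pixel can be expressed as a linear combination of those of its neighboring pixels. At first sight this seems arbitrary: why would such an assumption be justified? The compositing equation (1) offers some intuition.

Rearranging (1) gives, in the single-channel case, $\alpha_i = (I_i - B_i)/(F_i - B_i)$, so $\alpha_i$ can be viewed as a function of $I_i$. Seen this way, the authors' assumption is easier to accept.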
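To make the rearrangement concrete, here is a small sketch for a single-channel pixel. It assumes the foreground and background intensities are known, which is exactly what is not available in real matting; the function names are hypothetical and the numbers are made up for illustration.

```python
def composite(alpha, fg, bg):
    """Compositing equation: I = alpha * F + (1 - alpha) * B, single channel."""
    return alpha * fg + (1.0 - alpha) * bg

def alpha_from_pixel(pixel, fg, bg):
    """Invert the compositing equation for alpha; undefined when F equals B."""
    if fg == bg:
        raise ValueError("alpha is not identifiable when F equals B")
    return (pixel - bg) / (fg - bg)

fg, bg = 0.9, 0.1  # assumed known foreground / background intensities
for alpha in (0.0, 0.25, 1.0):
    pixel = composite(alpha, fg, bg)
    recovered = alpha_from_pixel(pixel, fg, bg)
    print(alpha, round(pixel, 3), round(recovered, 3))
```

With F and B fixed, alpha is an affine function of the observed intensity, which is the intuition behind modelling alpha from color within a small window.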

In the local learning stage, the authors adopt a linear alpha-color model:

$$\alpha = [\mathbf{x}^T\ 1]\,\boldsymbol{\beta} \tag{8}$$

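where $\boldsymbol{\beta} = [\beta_1, \dots, \beta_d, \beta_0]^T$ is a (d+1)-dimensional coefficient vector. Using the same notation as before, let $\boldsymbol{\alpha}_{N_i} = [\alpha_{\tau_1}, \dots, \alpha_{\tau_m}]^T$, and let $X_i$ be the matrix whose k-th row is $[\mathbf{x}_{\tau_k}^T\ 1]$. Here $\boldsymbol{\alpha}_{N_i}$ is an m-dimensional vector and $X_i$ is an m×(d+1) matrix, where m is the neighborhood size. The coefficients $\boldsymbol{\beta}$ in (8) can be estimated by ridge regression:

$$\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \|\boldsymbol{\alpha}_{N_i} - X_i \boldsymbol{\beta}\|^2 + \lambda_r \boldsymbol{\beta}^T \boldsymbol{\beta} \tag{9}$$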

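The optimal solution of (9) is

$$\hat{\boldsymbol{\beta}} = (X_i^T X_i + \lambda_r I_{d+1})^{-1} X_i^T \boldsymbol{\alpha}_{N_i} = X_i^T (X_i X_i^T + \lambda_r I_m)^{-1} \boldsymbol{\alpha}_{N_i} \tag{10}$$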

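Substituting (10) back into (8) gives

$$\alpha_i = [\mathbf{x}_i^T\ 1]\, X_i^T (X_i X_i^T + \lambda_r I_m)^{-1} \boldsymbol{\alpha}_{N_i} \tag{11}$$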
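The ridge-regression step can be checked numerically. The sketch below computes the combination coefficients $\mathbf{f}_i = (X_i X_i^T + \lambda_r I_m)^{-1} X_i [\mathbf{x}_i^T\ 1]^T$ for one window and confirms that predicting through $\mathbf{f}_i$ matches predicting through the primal ridge solution $\hat{\boldsymbol{\beta}}$. The window size, random colors, `lam=1e-3`, and the helper name `local_coefficients` are assumptions chosen for illustration.

```python
import numpy as np

def local_coefficients(X, x, lam=1e-3):
    """f_i = (X X^T + lam I_m)^{-1} X x, with X of shape (m, d+1)."""
    m = X.shape[0]
    return np.linalg.solve(X @ X.T + lam * np.eye(m), X @ x)

rng = np.random.default_rng(0)
m, d = 9, 3                                  # 3x3 window, colour features (toy sizes)
colors = rng.random((m, d))
X = np.hstack([colors, np.ones((m, 1))])     # rows are [x_tau^T, 1]
x = X[m // 2]                                # feature of the window's centre pixel
alpha_nbrs = rng.random(m)                   # stand-in neighbour alpha values

f = local_coefficients(X, x)

# Primal ridge solution: beta = (X^T X + lam I)^{-1} X^T alpha
beta = np.linalg.solve(X.T @ X + 1e-3 * np.eye(d + 1), X.T @ alpha_nbrs)

# Both routes give the same predicted alpha for the centre pixel.
print(f @ alpha_nbrs, x @ beta)
```

The point of the dual form is that the coefficients are computed from colors alone, before any alpha value is known, so they can be assembled into the matrix F up front.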

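Comparing with (2), the combination coefficients are $\mathbf{f}_i = (X_i X_i^T + \lambda_r I_m)^{-1} X_i [\mathbf{x}_i^T\ 1]^T$. They involve neither $\boldsymbol{\beta}$ nor any alpha value; they depend only on $X_i$ and $\mathbf{x}_i$. At this point the alpha-color model (8) that the authors keep emphasizing finally shows its value, and the result is indeed rather elegant.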

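The authors then introduce the kernel trick. Specifically:

$$\Phi: \mathbb{R}^d \to \mathbb{R}^p, \qquad k(\mathbf{x}_i, \mathbf{x}_j) = \Phi(\mathbf{x}_i)^T \Phi(\mathbf{x}_j) \tag{12}$$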

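Here $\Phi$ is the nonlinear mapping behind the kernel function k; in general p is larger than d (d being the dimension of x). Readers familiar with SVMs will recognize this construction. Replacing the inner product of $\mathbf{x}_i$ and $\mathbf{x}_j$ in (11) with $k(\mathbf{x}_i, \mathbf{x}_j)$ turns the Gram matrix $X_i X_i^T$ into a kernel matrix:

$$X_i X_i^T \;\to\; K_i, \qquad (K_i)_{jk} = k(\mathbf{x}_{\tau_j}, \mathbf{x}_{\tau_k}) \tag{13}$$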

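Similarly, the vector $X_i [\mathbf{x}_i^T\ 1]^T$ can be replaced by $\mathbf{k}_i$, that is

$$\mathbf{k}_i = [k(\mathbf{x}_{\tau_1}, \mathbf{x}_i), \dots, k(\mathbf{x}_{\tau_m}, \mathbf{x}_i)]^T \tag{14}$$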

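With the kernel function, the linear combination coefficients $\mathbf{f}_i$ become

$$\mathbf{f}_i = (K_i + \lambda_r I_m)^{-1} \mathbf{k}_i \tag{15}$$

where $K_i$ is the m×m kernel matrix over the neighborhood of pixel i and $\mathbf{k}_i$ is the vector of kernel values between the neighbors and pixel i.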
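A minimal sketch of the kernelized coefficients $\mathbf{f}_i = (K_i + \lambda_r I_m)^{-1} \mathbf{k}_i$ follows. The Gaussian kernel and its `sigma=0.5` are assumptions chosen for illustration (the summary above does not state which kernel the paper uses), and the function names are hypothetical. With a linear kernel the result reduces to the non-kernel coefficients, which gives a convenient sanity check.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=0.5):
    """Pairwise Gaussian kernel between rows of A and rows of B (assumed choice)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def linear_kernel(A, B):
    """Plain inner product; recovers the linear (non-kernel) model."""
    return A @ B.T

def kernel_coefficients(X, x, kernel, lam=1e-3):
    """f = (K + lam I_m)^{-1} k with K = kernel(X, X) and k = kernel(X, x)."""
    m = X.shape[0]
    K = kernel(X, X)
    k = kernel(X, x[None, :])[:, 0]
    return np.linalg.solve(K + lam * np.eye(m), k)

rng = np.random.default_rng(0)
m, d = 9, 3
X = np.hstack([rng.random((m, d)), np.ones((m, 1))])   # neighbour features [x^T, 1]
x = np.append(rng.random(d), 1.0)                      # feature of the query pixel

f_lin = kernel_coefficients(X, x, linear_kernel)
f_rbf = kernel_coefficients(X, x, gaussian_kernel)
print(np.round(f_lin, 3))
print(np.round(f_rbf, 3))
```

Only kernel evaluations between pixel features are needed, so the nonlinear mapping $\Phi$ never has to be formed explicitly.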

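2. Learning based matting vs. Closed-Form matting

First, a brief review of Closed-Form Matting. It rests on a local linear model: within a local window w, F and B can be treated as constants, so that

$$\alpha_i \approx a I_i + b, \quad \forall i \in w, \qquad a = \frac{1}{F - B}, \quad b = -\frac{B}{F - B}$$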

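Its cost function and the resulting alpha vector are, respectively:

$$J(\boldsymbol{\alpha}) = \boldsymbol{\alpha}^T L \boldsymbol{\alpha}, \qquad \boldsymbol{\alpha} = \arg\min_{\boldsymbol{\alpha}} \boldsymbol{\alpha}^T L \boldsymbol{\alpha} \quad \text{s.t.} \quad \alpha_i = \alpha_i^*,\ \forall i \in \Omega_l$$

where L is the Matting Laplacian matrix.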

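Rewriting (6) gives

$$\boldsymbol{\alpha} = \arg\min_{\boldsymbol{\alpha} \in \mathbb{R}^n} \boldsymbol{\alpha}^T L \boldsymbol{\alpha} + (\boldsymbol{\alpha} - \boldsymbol{\alpha}^*)^T C (\boldsymbol{\alpha} - \boldsymbol{\alpha}^*)$$

where the Matting Laplacian matrix is $L = (I_n - F)(I_n - F)^T$. The differences between Learning based matting and Closed-form matting are listed below: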

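1. Learning based matting is built on semi-supervised learning, whereas Closed-form matting is built on a local linear model.

2. The basic assumption of Closed-form matting is local smoothness (which implies the color-line model), and this assumption does not necessarily hold in complex scenes. Learning based matting, by contrast, extends the alpha-color model with the kernel trick, which gives it nonlinear expressive power.

3. In the Closed-form matting framework, the known information in the trimap is encoded as a hard constraint. In the Learning based matting framework, it is encoded as a soft constraint: the parameter c of the supervised term in (5) controls how strongly the labels are enforced.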
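The relation between the soft and hard constraints in point 3 can be illustrated on a toy problem. The sketch below minimizes $\boldsymbol{\alpha}^T L \boldsymbol{\alpha}$ once with the labeled pixels fixed exactly (hard constraint) and once with the penalty weight c (soft constraint), and prints how far apart the two solutions are as c grows. The 7-pixel 1-D setup and the uniform-averaging coefficient matrix are assumptions for illustration only.

```python
import numpy as np

n = 7
F = np.zeros((n, n))
for i in range(n):
    nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
    F[nbrs, i] = 1.0 / len(nbrs)          # assumed stand-in coefficients
L = (np.eye(n) - F) @ (np.eye(n) - F).T

lab = np.array([0, n - 1])                # labeled pixel indices
unk = np.arange(1, n - 1)                 # unknown pixel indices
alpha_l = np.array([0.0, 1.0])            # background, foreground labels

def soft(c):
    """Soft constraint: alpha = (L + C)^{-1} C alpha*."""
    C = np.zeros((n, n))
    C[lab, lab] = c
    a_star = np.zeros(n)
    a_star[lab] = alpha_l
    return np.linalg.solve(L + C, C @ a_star)

# Hard constraint: labeled alphas are fixed, unknown ones minimise alpha^T L alpha,
# which gives L_uu alpha_u = -L_ul alpha_l.
hard = np.zeros(n)
hard[lab] = alpha_l
hard[unk] = np.linalg.solve(L[np.ix_(unk, unk)], -L[np.ix_(unk, lab)] @ alpha_l)

for c in (1.0, 100.0, 1e6):
    print(c, np.abs(soft(c) - hard).max())
```

A small c lets the solution drift away from the user labels, while a very large c makes the soft formulation behave almost like the hard one.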

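3. Global Learning based matting

The learning-based matting described so far is local: it builds a model for the alpha value of every pixel in the image. When the user-provided trimap is very detailed, continuing with local matting is unnecessarily slow and laborious. The authors' global learning based matting is motivated by this: it models only the uncertain pixels of the trimap and then solves for them. The algorithm flow is given first, followed by a look at its technical details.

[Figure: algorithm flow of global learning based matting; image not available]

In this algorithm, step 1 sets a global distance threshold. Step 2 builds the candidate foreground and background sets: every pixel in them lies within the threshold distance of some pixel in $\Omega_u$, which already carries the idea of nonlocal matting. The remaining steps are similar to the local learning part and are not repeated here.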
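Steps 1 and 2 can be sketched as follows. This covers only the candidate-set construction; the trimap encoding (1 = foreground, 0 = background, 0.5 = unknown), the use of spatial Euclidean distance, and the function name `candidate_sets` are all assumptions for illustration, not details confirmed by the summary above.

```python
import numpy as np

def candidate_sets(trimap, dist_thresh):
    """Return (fg, bg): coordinates of known pixels near the unknown region.

    trimap      : 2-D array, 1 = foreground, 0 = background, 0.5 = unknown (assumed)
    dist_thresh : global distance threshold, in pixels (assumed spatial distance)
    """
    unknown = np.argwhere(trimap == 0.5)
    selected = []
    for label in (1.0, 0.0):
        known = np.argwhere(trimap == label)
        if len(unknown) == 0 or len(known) == 0:
            selected.append(known[:0])
            continue
        # Brute-force distance from every known pixel to every unknown pixel.
        dists = np.linalg.norm(known[:, None, :] - unknown[None, :, :], axis=2)
        selected.append(known[dists.min(axis=1) < dist_thresh])
    return selected[0], selected[1]

trimap = np.array([
    [0.0, 0.0, 0.5, 1.0, 1.0],
    [0.0, 0.0, 0.5, 1.0, 1.0],
])
fg, bg = candidate_sets(trimap, dist_thresh=1.5)
print(fg.tolist())
print(bg.tolist())
```

The brute-force distance matrix is fine for a toy example; for a real image a distance transform or a spatial index would be a more practical choice.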

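4. Differences between local_learning and global_learning

From the discussion above, local_learning has to traverse and compute over all pixels of the input, which is computationally expensive, whereas global_learning only traverses the pixels in the unknown region. When the user provides a scribble (which can be seen as a coarse trimap), the unknown region is large and global_learning may predict inaccurately, so local_learning is preferred. When the user provides a fairly detailed trimap, the unknown region is narrow and global_learning is recommended. The paper includes a figure comparing the two:

[Figure: comparison of local_learning and global_learning from the paper; image not available]

In that figure, a larger trimap level on the horizontal axis means a coarser trimap; the vertical axis is the MSE.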

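5. Conclusion

Overall, learning based matting requires no iteration, readily accommodates nonlinear mappings, and has a closed-form solution, which makes it fairly widely used in engineering applications.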
