A Summary of Learning Based Digital Matting

Title: Learning Based Digital Matting

Paper link: https://pan.baidu.com/s/1kXgQq39  Password: y5qq

MATLAB code link: http://download.****.net/download/on_theway10/10112302

In general, image matting consists of two tasks: estimating the alpha matte and computing the foreground F and background B. Let the input image be I and the set of pixel indices be $\Omega = \{1, \dots, n\}$, where n is the number of pixels. Each pixel $i \in \Omega$ is described by a feature vector $x_i \in \mathbb{R}^d$ (d=1 for a gray image, d=3 for a BGR image). The pixels covered by the scribble or trimap form the labeled set $\Omega_l$, with $\Omega_l \subset \Omega$, and the remaining pixels belong to the unlabeled set $\Omega_u$. The pixels in $\Omega_l$ are further divided into foreground $\Omega_F$ (alpha equal to 1) and background $\Omega_B$ (alpha equal to 0). In the figure below, for example, blue in (c) marks the foreground and red marks the background.

[Figure: example panels, including (c) the user scribbles and (f) the image intensity surface]

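The authors treat alpha matte estimation as a learning problem, split into scribble-based local learning and trimap-based global learning. A quick note on the difference between a scribble and a trimap: a scribble gives only rough foreground/background cues (e.g. panel (c)), whereas a trimap gives comparatively precise cues. A trimap clearly carries more useful information than a scribble, so the alpha map produced by the matting algorithm is more accurate; however, producing a trimap costs far more time and other resources than drawing a scribble.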

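Here we focus on the local learning strategy, which the authors frame as semi-supervised learning. Semi-supervised methods generally exploit the manifold structure of the data. In local learning, that manifold structure degenerates to the image intensity surface formed by the image pixels, as shown in panel (f): the x-axis is image.cols, the y-axis is image.rows, and the z-axis is the pixel intensity. This differs from how the neighbors of a sample are computed in general semi-supervised learning.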
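To make the intensity surface concrete, here is a minimal sketch that turns a gray image into a list of (col, row, intensity) points. The helper name `intensity_surface` and the tiny 2*2 image are made up for illustration and are not taken from the paper or its MATLAB code.

```python
import numpy as np

def intensity_surface(gray):
    """Return an (rows*cols, 3) array of (col, row, intensity) points."""
    rows, cols = gray.shape
    # np.indices gives the row and column coordinate of every pixel.
    rr, cc = np.indices((rows, cols))
    return np.stack([cc.ravel(), rr.ravel(), gray.ravel()], axis=1)

gray = np.array([[0.0, 0.5],
                 [1.0, 0.25]])
surface = intensity_surface(gray)   # one 3-D point per pixel
```

Neighbors on this surface are simply pixels that are close in the image grid, which is why local learning works with a window around each pixel.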

 1. Alpha Matte Estimation

We start with the estimation of the alpha matte. The compositing equation of the matting problem is:

$$I_i = \alpha_i F_i + (1 - \alpha_i) B_i \tag{1}$$
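As a quick illustration of the compositing equation, the sketch below blends one foreground pixel and one background pixel. The function name `composite` and the colors are hypothetical; only the formula itself comes from the text.

```python
import numpy as np

def composite(alpha, fg, bg):
    """Blend foreground and background per pixel: I = alpha*F + (1-alpha)*B."""
    alpha = np.asarray(alpha, dtype=float)[..., None]  # broadcast over channels
    fg = np.asarray(fg, dtype=float)
    bg = np.asarray(bg, dtype=float)
    return alpha * fg + (1.0 - alpha) * bg

fg = np.array([[1.0, 0.0, 0.0]])   # one foreground pixel
bg = np.array([[0.0, 0.0, 1.0]])   # one background pixel
pixel = composite([0.25], fg, bg)  # 25% foreground, 75% background
```

Matting is the inverse problem: only `pixel` is observed, and alpha, F and B all have to be estimated.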

The compositing equation needs little explanation; it already appears in earlier matting papers. For any pixel $i \in \Omega$, assume that $\alpha_i$ can be expressed linearly by the alpha values $\{\alpha_j\}_{j \in N_i}$ of its neighboring pixels (is this assumption reasonable? Readers may want to think it over). The size of the neighborhood $N_i$ is a free choice; the paper uses 7*7. Write $N_i = \{\tau_1, \dots, \tau_m\}$, let $\boldsymbol{\alpha}_{N_i} = [\alpha_{\tau_1}, \dots, \alpha_{\tau_m}]^T$ be the vector of alpha values of the pixels in $N_i$, and let $f_i = [f_{i\tau_1}, \dots, f_{i\tau_m}]^T$ be the combination coefficients. Then $\alpha_i$ can be written as:

$$\alpha_i = f_i^T \boldsymbol{\alpha}_{N_i} \tag{2}$$

Rewriting the expression above gives

$$\alpha_i = \xi_i^T \alpha \tag{3}$$

where $\alpha = [\alpha_1, \dots, \alpha_n]^T$ and $\xi_i \in \mathbb{R}^n$. An entry of $\xi_i$ whose index is not in the neighborhood $N_i$ of pixel $i$ is set to zero; otherwise it takes the corresponding coefficient from (2). Stacking (3) over all pixels gives

$$\alpha = F^T \alpha \tag{4}$$

where F is an n*n matrix whose i-th column is $\xi_i$ (not to be confused with the foreground F). If F is known, $\alpha$ can be obtained as follows:

$$\hat{\alpha} = \arg\min_{\alpha} \; \|\alpha - F^T \alpha\|^2 + c \, \|\alpha_l - \alpha_l^*\|^2 \tag{5}$$

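This objective is the classic semi-supervised learning framework: the first term encodes the semi-supervised information, and the second term measures the supervised learning error. A simple rearrangement of (5) gives: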

$$\hat{\alpha} = \arg\min_{\alpha} \; \alpha^T (I_n - F)(I_n - F)^T \alpha + (\alpha - \alpha^*)^T C (\alpha - \alpha^*) \tag{6}$$

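Here $I_n$ is the n*n identity matrix, $C$ is the n*n diagonal matrix whose i-th diagonal entry is $c$ if $i \in \Omega_l$ and 0 otherwise, and $\alpha^*$ is $\alpha_l^*$ extended to an n-dimensional vector. Taking the derivative with respect to the alpha vector and setting it to zero yields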

$$\hat{\alpha} = \left( (I_n - F)(I_n - F)^T + C \right)^{-1} C \alpha^* \tag{7}$$
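The closed-form solution can be tried on a toy problem. The sketch below is an illustration under stated assumptions, not the authors' implementation: the "image" is a 1-D row of 7 pixels, the coefficients in each column of F are a plain average of the two neighbors (a hypothetical stand-in for the learned $f_i$), and `solve_alpha` and `c=800.0` are names and values chosen here.

```python
import numpy as np

def solve_alpha(F, labeled, alpha_star, c=800.0):
    """Closed-form minimizer: ((I-F)(I-F)^T + C)^-1 C alpha*."""
    n = F.shape[0]
    C = np.diag(np.where(labeled, c, 0.0))   # weight c on labeled pixels only
    A = np.eye(n) - F
    return np.linalg.solve(A @ A.T + C, C @ alpha_star)

n = 7
F = np.zeros((n, n))
for i in range(n):
    # Hypothetical coefficients: average of the 1-D neighbours of pixel i.
    nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
    F[nbrs, i] = 1.0 / len(nbrs)             # column i of F is xi_i

labeled = np.zeros(n, dtype=bool)
labeled[[0, n - 1]] = True                   # two "scribbled" pixels
alpha_star = np.zeros(n)
alpha_star[n - 1] = 1.0                      # last pixel foreground, first background
alpha = solve_alpha(F, labeled, alpha_star)
```

The two labeled pixels stay close to 0 and 1, and the unlabeled pixels in between are filled in by the smoothness term. For a real image F is large and sparse, so a sparse solver would be the natural choice.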

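To obtain alpha we need F. To compute F, the authors assume that the alpha value of each pixel can be expressed as a linear combination over its neighboring pixels. At first sight this looks arbitrary: why is the assumption justified? Let us analyze it starting from the compositing equation of the matting problem, i.e. equation (1):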

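Rearranging that equation gives $\alpha_i = (I_i - B_i) / (F_i - B_i)$ (per channel, for $F_i \neq B_i$), so alpha_i can be viewed as a function of I_i, which makes the authors' assumption easier to accept.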
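For color pixels the same rearrangement can be checked numerically. The sketch below assumes F and B are known and constant, which is a simplification made here for illustration; under that assumption alpha is an affine function of the color I, with `a` and `b` playing the roles of a weight vector and a bias.

```python
import numpy as np

def alpha_from_color(I, F, B):
    """Recover alpha from I = alpha*F + (1-alpha)*B for known, distinct F and B."""
    I, F, B = (np.asarray(v, dtype=float) for v in (I, F, B))
    diff = F - B
    a = diff / diff.dot(diff)   # weight vector: alpha = a . I + b
    b = -B.dot(a)               # bias term
    return I.dot(a) + b

F = np.array([0.9, 0.2, 0.1])
B = np.array([0.1, 0.3, 0.8])
I = 0.3 * F + 0.7 * B           # composite with alpha = 0.3
alpha = alpha_from_color(I, F, B)
```

This follows from $I - B = \alpha (F - B)$: taking the inner product of both sides with $F - B$ isolates alpha.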

In the local learning stage, the authors adopt a linear alpha-color model:

$$\alpha = [x^T \;\; 1] \, \beta \tag{8}$$

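where $\beta = [\beta_1, \dots, \beta_d, \beta_0]^T$ is a (d+1)-dimensional vector. With the same notation as before, let $\boldsymbol{\alpha}_{N_i} = [\alpha_{\tau_1}, \dots, \alpha_{\tau_m}]^T$ and let $X_i$ be the matrix whose j-th row is $[x_{\tau_j}^T \;\; 1]$ (here $\boldsymbol{\alpha}_{N_i}$ is an m-dimensional vector and $X_i$ is an m*(d+1) matrix, where m is the size of the neighborhood $N_i$). To estimate the beta in (8), ridge regression gives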

$$\hat{\beta} = \arg\min_{\beta} \; \|\boldsymbol{\alpha}_{N_i} - X_i \beta\|^2 + \lambda_r \|\beta\|^2 \tag{9}$$

For (9), the optimal beta is

$$\hat{\beta} = (X_i^T X_i + \lambda_r I_{d+1})^{-1} X_i^T \boldsymbol{\alpha}_{N_i} \tag{10}$$

Substituting (10) back into (8) readily gives

$$\alpha_i = [x_i^T \;\; 1] \, (X_i^T X_i + \lambda_r I_{d+1})^{-1} X_i^T \boldsymbol{\alpha}_{N_i} \tag{11}$$

Comparing with (2), the combination coefficients no longer involve $\beta$; they depend only on $x_i$ and $X_i$. At this point the alpha-color model (8) that the authors keep emphasizing finally shows its role, and the result is indeed quite elegant.
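The ridge-regression coefficients can be sketched in a few lines. This is an illustration under assumptions made here: the helper name `local_coefficients`, the regularization value `lam=1e-7`, and the random 9-pixel neighborhood are all hypothetical, and the bias entry is regularized together with the other weights for simplicity.

```python
import numpy as np

def local_coefficients(x_i, X_nbrs, lam=1e-7):
    """f_i such that alpha_i ~= f_i . alpha_N, from ridge regression on [x, 1]."""
    m, d = X_nbrs.shape
    X = np.hstack([X_nbrs, np.ones((m, 1))])   # m x (d+1), rows [x_j^T 1]
    x = np.append(x_i, 1.0)                    # [x_i^T 1]
    G = X.T @ X + lam * np.eye(d + 1)
    return X @ np.linalg.solve(G, x)           # f_i = X (X^T X + lam I)^-1 [x_i; 1]

rng = np.random.default_rng(0)
X_nbrs = rng.random((9, 3))                    # 9 neighbours with 3-channel features
beta, beta0 = np.array([0.5, -0.2, 0.3]), 0.1
alpha_nbrs = X_nbrs @ beta + beta0             # alphas that follow the linear model exactly
x_i = rng.random(3)
f_i = local_coefficients(x_i, X_nbrs)
alpha_i = f_i @ alpha_nbrs                     # prediction for the center pixel
```

Note that `f_i` is computed from colors alone. When the neighborhood alphas really follow a linear alpha-color model, the prediction `alpha_i` agrees with `x_i @ beta + beta0` up to the small bias introduced by `lam`.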

Going one step further, the authors introduce the kernel trick. Specifically:

$$k(x_i, x_j) = \Phi(x_i)^T \Phi(x_j), \qquad \Phi: \mathbb{R}^d \to \mathbb{R}^p$$

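Here $\Phi$ is the nonlinear mapping that defines the kernel function $k$, and p is typically larger than d (the dimension of x); readers familiar with SVMs will recognize the trick. Because $(X^T X + \lambda I)^{-1} X^T = X^T (X X^T + \lambda I)^{-1}$, the coefficients in (11) depend on the data only through inner products between feature vectors. Replacing the inner product of $x_i$ and $x_j$ with $k(x_i, x_j)$ gives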

$$X_i X_i^T \;\rightarrow\; K_i, \qquad (K_i)_{ab} = k(x_{\tau_a}, x_{\tau_b})$$

Similarly, the vector of inner products between the neighbors and $x_i$ can be replaced by $k_i$, that is

$$X_i [x_i^T \;\; 1]^T \;\rightarrow\; k_i, \qquad (k_i)_a = k(x_{\tau_a}, x_i)$$

This yields the linear combination coefficients $f_i$ under the kernel function:

$$f_i = (K_i + \lambda_r I_m)^{-1} k_i$$
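The kernel form can be checked against the linear case. In the sketch below, the names `kernel_coefficients`, `linear_kernel` and `gaussian_kernel`, the value of `lam`, and the kernel width `sigma` are assumptions made for illustration. With a linear kernel on the augmented vectors $[x, 1]$, the kernel coefficients should match the ridge-regression coefficients $X (X^T X + \lambda I)^{-1} [x_i; 1]$.

```python
import numpy as np

def kernel_coefficients(x_i, X_nbrs, kernel, lam=1e-3):
    """f_i = (K_i + lam I)^-1 k_i for an arbitrary kernel function."""
    m = X_nbrs.shape[0]
    K = np.array([[kernel(a, b) for b in X_nbrs] for a in X_nbrs])
    k = np.array([kernel(a, x_i) for a in X_nbrs])
    return np.linalg.solve(K + lam * np.eye(m), k)

def linear_kernel(a, b):
    # Inner product of the augmented vectors [a, 1] and [b, 1].
    return a @ b + 1.0

def gaussian_kernel(a, b, sigma=0.5):
    return np.exp(-np.sum((a - b) ** 2) / (2.0 * sigma ** 2))

rng = np.random.default_rng(1)
X_nbrs, x_i, lam = rng.random((9, 3)), rng.random(3), 1e-3

# Primal (feature-space) coefficients for comparison.
X = np.hstack([X_nbrs, np.ones((9, 1))])
primal = X @ np.linalg.solve(X.T @ X + lam * np.eye(4), np.append(x_i, 1.0))
dual = kernel_coefficients(x_i, X_nbrs, linear_kernel, lam)
f_rbf = kernel_coefficients(x_i, X_nbrs, gaussian_kernel, lam)
```

`primal` and `dual` agree because of the identity $X (X^T X + \lambda I)^{-1} = (X X^T + \lambda I)^{-1} X$. Swapping in `gaussian_kernel` changes only the kernel function, which is how the nonlinear mapping enters without ever computing $\Phi$ explicitly.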

 

 2. Learning based matting Vs. Closed-Form matting

 

First, a brief review of Closed-Form Matting. It rests on the local linear model, i.e. within a local window, F_i and B_i can be treated as constants:

$$\alpha_i \approx a I_i + b, \quad \forall i \in w, \qquad a = \frac{1}{F - B}, \quad b = -\frac{B}{F - B}$$

Its cost function and alpha vector are, in turn:

$$J(\alpha) = \alpha^T L \alpha, \qquad \hat{\alpha} = \arg\min_{\alpha} \; \alpha^T L \alpha \quad \text{s.t. } \alpha_i = \alpha_i^*, \; \forall i \in \Omega_l$$

where L denotes the Matting Laplacian matrix.

Rewriting (6) gives

$$\hat{\alpha} = \arg\min_{\alpha} \; \alpha^T L \alpha + (\alpha - \alpha^*)^T C (\alpha - \alpha^*)$$

where the Matting Laplacian matrix is $L = (I_n - F)(I_n - F)^T$. The differences between learning based matting and closed-form matting are listed below:

1. Learning based matting is built on semi-supervised learning, whereas closed-form matting is built on the local linear model;

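2. The basic assumption of closed-form matting is local smoothness (which implies the color-line model), and this assumption does not necessarily hold in complex scenes; by contrast, learning based matting extends the alpha-color model through the kernel trick, which gives it nonlinear expressive power;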

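3. In the closed-form matting framework, the known information in the trimap is encoded as a hard constraint; in the learning based matting framework, it is encoded as a soft constraint (the parameter c of the supervised term in (5) controls how strongly the supervision is used).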
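The third difference can be made concrete with a small experiment. The sketch below is a toy under assumptions made here (a 1-D row of 6 pixels, averaging weights in F, and the helper names `learning_laplacian`, `solve_soft`, `solve_hard`); it is not taken from either paper's code. It builds $L = (I - F)(I - F)^T$ and compares the soft-constraint solution for two values of c with the hard-constraint solution on the same L.

```python
import numpy as np

def learning_laplacian(F):
    """L = (I - F)(I - F)^T, symmetric positive semi-definite by construction."""
    A = np.eye(F.shape[0]) - F
    return A @ A.T

def solve_soft(L, labeled, alpha_star, c):
    """Soft constraint: minimise a^T L a + c * sum over labeled (a_i - a*_i)^2."""
    C = np.diag(np.where(labeled, c, 0.0))
    return np.linalg.solve(L + C, C @ alpha_star)

def solve_hard(L, labeled, alpha_star):
    """Hard constraint: minimise a^T L a with the labeled entries fixed exactly."""
    u = ~labeled
    alpha = alpha_star.astype(float).copy()
    rhs = -L[np.ix_(u, labeled)] @ alpha_star[labeled]
    alpha[u] = np.linalg.solve(L[np.ix_(u, u)], rhs)
    return alpha

# Toy 1-D "image": column i of F averages the neighbours of pixel i (hypothetical weights).
n = 6
F = np.zeros((n, n))
for i in range(n):
    nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
    F[nbrs, i] = 1.0 / len(nbrs)

L = learning_laplacian(F)
labeled = np.array([True, False, False, False, False, True])
alpha_star = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0])
hard = solve_hard(L, labeled, alpha_star)
soft_weak = solve_soft(L, labeled, alpha_star, c=1.0)
soft_strong = solve_soft(L, labeled, alpha_star, c=1e6)
```

With a small c the labeled pixels are allowed to drift away from their given values; as c grows, the soft solution approaches the hard-constrained one. That is the sense in which c controls how strongly the supervision is used.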

 

 3. Global Learning based matting

 

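The learning based matting described so far is local: it builds a model for the alpha of every pixel in the image. When the user supplies a detailed trimap, local matting becomes needlessly slow and laborious. The authors' global learning based matting addresses exactly this case: it models only the pixels in the unknown part of the trimap and then solves for them. The algorithm flow is given first, followed by an analysis of its technical details.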

[Figure: algorithm flow of global learning based matting]

In the algorithm above, step 1 sets a global distance threshold; step 2 builds the candidate foreground and background sets, whose pixels all lie within that threshold of a pixel in the unknown region, which already carries the idea of nonlocal matting. The remaining steps are similar to the local learning part and are not repeated here.
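The idea behind these steps can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it trains one kernel ridge regressor from color to alpha on the labeled pixels that lie within a distance threshold of the unknown region, then predicts the unknown pixels. The function name `global_learning_sketch`, the trimap encoding (1, 0, -1), the Gaussian kernel, and all parameter values are assumptions made here.

```python
import numpy as np

def global_learning_sketch(colors, coords, trimap, dist_thresh, lam=1e-3, sigma=0.3):
    """trimap: 1 = foreground, 0 = background, -1 = unknown (hypothetical encoding)."""
    unknown = trimap == -1
    # Distance from every pixel to its nearest unknown pixel.
    diff = coords[:, None, :] - coords[unknown][None, :, :]
    d = np.linalg.norm(diff, axis=2).min(axis=1)
    train = (~unknown) & (d < dist_thresh)    # candidate foreground/background pixels

    def gram(A, B):
        # Gaussian kernel matrix between two sets of color vectors.
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-sq / (2.0 * sigma ** 2))

    K = gram(colors[train], colors[train])
    w = np.linalg.solve(K + lam * np.eye(K.shape[0]), trimap[train].astype(float))
    alpha = trimap.astype(float).copy()
    pred = gram(colors[unknown], colors[train]) @ w
    alpha[unknown] = np.clip(pred, 0.0, 1.0)  # keep predictions in [0, 1]
    return alpha

# A row of 6 gray pixels: dark background, bright foreground, two unknown in between.
colors = np.array([[0.10], [0.12], [0.30], [0.70], [0.90], [0.88]])
coords = np.array([[0, i] for i in range(6)], dtype=float)
trimap = np.array([0, 0, -1, -1, 1, 1])
alpha = global_learning_sketch(colors, coords, trimap, dist_thresh=2.5)
```

Only the two unknown pixels are predicted; the labeled pixels keep their trimap values. The darker unknown pixel ends up closer to background and the brighter one closer to foreground.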

4. Differences between local_learning and global_learning

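As described above, local_learning traverses all pixels of the input, which is computationally expensive, whereas global_learning traverses only the pixels in the unknown region. Hence, when the user supplies a scribble (which can be seen as a coarse trimap), the unknown region is large and global_learning may predict inaccurately, so local_learning is used; when the user supplies a fairly detailed trimap, the unknown region is narrow and global_learning is recommended. The comparison figure from the paper is shown below: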

[Figure: comparison of local_learning and global_learning from the paper]

 

In the figure, a larger trimap level on the horizontal axis means a coarser trimap; the vertical axis is the MSE.

5. Conclusion

Overall, learning based matting needs no iteration, readily accommodates nonlinear mappings, has a closed-form solution, and is fairly widely used in engineering applications.