Combining Sketch and Tone for Pencil Drawing Production

C. W. Lu, L. Xu, J. Y. Jia, Combining Sketch and Tone for Pencil Drawing Production, NPAR (2012) Best Paper

Abstract

Combining the tone and stroke structures.

Prior knowledge on pencil drawing.

1 Introduction

Pencil drawing synthesis methods: (1) 2D image-based rendering; (2) 3D model-based rendering.

Sketch: a quickly finished work without a lot of details, used to depict the global shape and main contours. Hatching: dark parallel strokes drawn in a region to depict tone or shading.

The paper proposes a two-stage system with stroke and tone management:

(1) Sketch: stroke generation is put into a convolution framework, which captures the essential characteristics of pencil sketch and simulates rapid nib movement in drawing.

(2) Hatching: tonal patterns (dense pencil strokes without dominant directions) are introduced to avoid artifacts; parametric histogram models are proposed to adjust the tone.

(3) Exponential model with global optimization: adjusts the tone and improves the rendering of heavily textured regions and object contours.

Applying the method to the luminance channel of a color image yields a color pencil drawing.
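The luminance-channel idea can be sketched as follows. This is only an illustration under stated assumptions: the BT.601 YUV matrix is my choice of color space, and `pencil_fn` is a hypothetical placeholder for the grayscale pencil pipeline described in these notes.

```python
import numpy as np

# BT.601 RGB -> YUV matrix (assumed color space; rows give Y, U, V).
RGB_TO_YUV = np.array([
    [0.299, 0.587, 0.114],
    [-0.14713, -0.28886, 0.436],
    [0.615, -0.51499, -0.10001],
])
YUV_TO_RGB = np.linalg.inv(RGB_TO_YUV)

def color_pencil(rgb, pencil_fn):
    """Apply a grayscale pencil filter to the luminance channel only.

    rgb: float array of shape (H, W, 3) with values in [0, 1].
    pencil_fn: hypothetical callable mapping an (H, W) luminance map to an
               (H, W) pencil rendering in [0, 1].
    """
    yuv = rgb @ RGB_TO_YUV.T
    yuv[..., 0] = pencil_fn(yuv[..., 0])   # replace luminance, keep chroma
    return np.clip(yuv @ YUV_TO_RGB.T, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))
# With an identity "filter" the round trip should reproduce the input.
restored = color_pencil(image, lambda y: y)
```

Passing the identity function is just a sanity check of the color conversion; in practice `pencil_fn` would be the stroke and tone pipeline.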

2 Related Work

Previous work generally determines the orientation of strokes and hatches from local structures, which is unstable in highly textured or noisy regions.

3 Model

The image-based pencil sketching method proposed in the paper (Fig. (2)) consists of two steps: (1) pencil stroke generation; (2) pencil tone drawing.

Strokes express the general structures of the scene; tone focuses on shapes and shadow. The latter depicts global illumination and accentuates shading and dark regions.


3.1 Line Drawing with Strokes

In a sketch, a stroke is characterized by its thickness, wiggliness, and brightness. Strokes usually end at points of curvature or at junctions, and long continuous curves are rare.

Classification:

The gradient of the grayscale version of the input image is:

$\mathbf{G} = \left( (\partial_{x} \mathbf{I})^{2} + (\partial_{y} \mathbf{I})^{2} \right)^{\frac{1}{2}} \tag {1}$

where $\mathbf{I}$ is the grayscale input and $\partial_{x}$, $\partial_{y}$ are the gradient operators, which can be computed by forward difference. The gradient map of a natural image is typically noisy and does not contain continuous edges immediately ready for stroke generation.
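As a rough illustration of Eq. (1), the following NumPy sketch computes the gradient magnitude with forward differences. The function name `gradient_magnitude` and the choice to leave the last row and column at zero are assumptions of this example, not details from the paper.

```python
import numpy as np

def gradient_magnitude(gray):
    """Eq. (1): gradient magnitude of a 2-D grayscale array."""
    gray = np.asarray(gray, dtype=np.float64)
    dx = np.zeros_like(gray)
    dy = np.zeros_like(gray)
    # Forward differences; the last column/row has no successor and stays 0.
    dx[:, :-1] = gray[:, 1:] - gray[:, :-1]
    dy[:-1, :] = gray[1:, :] - gray[:-1, :]
    return np.sqrt(dx ** 2 + dy ** 2)

# A vertical step edge: left half 0, right half 1.
step = np.zeros((4, 6))
step[:, 3:] = 1.0
G = gradient_magnitude(step)   # non-zero only in column 2
```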

Estimating the line direction for each pixel:

Choose eight reference directions $45 \degree$ apart and denote the corresponding line segments by $\left\{ \mathscr{L}_{i} \right\}, \ i \in \left\{ 1, \dots, 8 \right\}$ . The response map for a given direction is:

$\mathbf{G}_{i} = \mathscr{L}_{i} * \mathbf{G} \tag {2}$

where $\mathscr{L}_{i}$ is the line segment in the $i$-th direction, represented as a convolution kernel; its length is set empirically to $\frac{1}{30}$ of the input image height or width; $*$ is the convolution operator, which groups gradient magnitudes along direction $i$ to form the filter response map $\mathbf{G}_{i}$ .

Classification: each pixel is assigned to the direction with the maximum response among all directions,

$\mathbf{C}_{i}(p) = \begin{cases} \mathbf{G}(p) & \text{if }\argmax_{j} \mathbf{G}_{j}(p) = i \\ 0 & \text{otherwise} \end{cases} \tag {3}$

where $p$ indexes pixels and $\mathbf{C}_{i}$ is the magnitude map for direction $i$ (Fig. (4)), so that $\sum_{i = 1}^{8} \mathbf{C}_{i} = \mathbf{G}$ . The classification is very robust against different types of noise: because each response aggregates gradients along a line segment, a pixel is classified from the support of its neighbors, no matter whether its own gradient is noise contaminated or not.
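Eq. (2) and Eq. (3) can be sketched in NumPy as below. Several choices here are my assumptions and not taken from the paper: the segments are centered on the pixel and spaced 180/8 degrees apart so that all eight kernels are distinct, each kernel is normalized to sum to 1, and the image border is zero padded. The names `line_kernels`, `conv2d_same`, and `classify` are made up for this example.

```python
import numpy as np

def line_kernels(length, n_dirs=8):
    """Rasterize n_dirs centered line segments as square kernels."""
    size = length if length % 2 == 1 else length + 1
    c = size // 2
    t = np.linspace(-c, c, 4 * size)           # dense samples along the segment
    kernels = []
    for k in range(n_dirs):
        theta = np.pi * k / n_dirs             # assumed angular spacing
        cols = np.rint(c + t * np.cos(theta)).astype(int)
        rows = np.rint(c - t * np.sin(theta)).astype(int)
        ker = np.zeros((size, size))
        ker[rows, cols] = 1.0
        kernels.append(ker / ker.sum())        # assumed normalization
    return kernels

def conv2d_same(img, ker):
    """Zero-padded 2-D convolution with output the same size as img."""
    ker = ker[::-1, ::-1]                      # flip for true convolution
    kh, kw = ker.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(img.shape, dtype=np.float64)
    for r in range(kh):
        for c in range(kw):
            if ker[r, c] != 0.0:
                out += ker[r, c] * padded[r:r + img.shape[0], c:c + img.shape[1]]
    return out

def classify(G, kernels):
    """Eq. (2): directional responses; Eq. (3): keep G where direction i wins."""
    responses = np.stack([conv2d_same(G, k) for k in kernels])
    winner = responses.argmax(axis=0)
    return np.stack([np.where(winner == i, G, 0.0) for i in range(len(kernels))])

G = np.zeros((21, 21))
G[10, :] = 1.0                                 # a horizontal edge
C = classify(G, line_kernels(7))               # C[0] is the horizontal map
```

Because every pixel has exactly one winning direction, the maps in `C` sum back to `G`, which matches the property stated for Eq. (3).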

■■

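Eq. (3) in the original paper reads

$\mathbf{C}_{i} = \begin{cases} \mathbf{G}(p) & \text{if }\argmin_{i} \mathbf{G}_{i} = i \\ 0 & \text{otherwise} \end{cases}$

which is incorrect: classification selects the maximum response, so the condition should use $\argmax$.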

■


Line shaping

Given the maps $\mathbf{C}_{i}$ , the line at each pixel is generated by:

$\mathbf{S}^{\prime} = \sum_{i = 1}^{8} \mathscr{L}_{i} * \mathbf{C}_{i}$

Convolution aggregates nearby pixels along direction $\mathscr{L}_{i}$ , which links edge pixels that are not even connected in the original gradient map. Inverting the pixel values of $\mathbf{S}^{\prime}$ and mapping them to $[0, 1]$ gives the output pencil stroke map $\mathbf{S}$ .

Discussion

When generating strokes, the method takes into account edge shape and neighborhood pixel support, which captures the sketch nature.

The convolution step extends the ends of two lines at a junction point. Because directional convolution only aggregates pixels along strictly straight lines, only long straight lines in the original edge map are extended.
Pixels at the center of a long line receive pixel values from both sides, which makes the line center darker than the ends.
Pixels that are not necessarily connected in the original edge map are linked when nearby pixels are mostly aligned along the straight line, which imitates the human line drawing process and resists noise.
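The effects discussed here (gap linking, extended ends, darker line centers) can be seen in one dimension. This is a minimal sketch, assuming a single horizontal direction and a box kernel; `shape_line_1d` is a name made up for this example, and the full method sums the convolutions over all eight directions.

```python
import numpy as np

def shape_line_1d(c_row, length):
    """Convolve one row of a magnitude map with a box kernel, then
    map to [0, 1] and invert so that strokes are dark."""
    s_prime = np.convolve(c_row, np.ones(length), mode="same")
    peak = s_prime.max()
    if peak > 0:
        s_prime = s_prime / peak
    return 1.0 - s_prime

# Two edge fragments separated by a one-pixel gap at index 5.
c_row = np.array([0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0], dtype=float)
S = shape_line_1d(c_row, 5)
# S == [0.75, 0.5, 0.25, 0.25, 0, 0, 0, 0.25, 0.25, 0.5, 0.75]
```

The gap pixel ends up as dark as its neighbors (the fragments are linked), the stroke reaches past the original end points, and the center is darker than the ends.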

Natural images contain many textured surfaces with fine details, so their gradient maps are very noisy (Fig. (7)); the proposed method remains effective on such noisy gradient maps.


3.2 Tone Drawing

Hatching, i.e. dense strokes, is used to depict shadow, shading, and dark objects. This section describes a scheme for rendering pencil texture.

Tone map generation

First, the tone value of each pixel is determined from the information in the original grayscale input.

The paper proposes a parametric model to fit the tone distributions of pencil sketches. Natural images have highly variable tones, whereas a sketch tone histogram follows certain patterns. This is because a sketch has two basic tones: (1) bright regions; (2) dark regions drawn with heavy strokes. Between these two tones, mild tone strokes are produced to enrich the layer information.

Model-based tone transfer

The target tone distribution is fitted with a parametric model:

$p(v) = \frac{1}{Z} \sum_{i = 1}^{3} w_{i} p_{i}(v) \tag {4}$

where $v$ is the tone value; $p(v)$ is the probability that a pixel takes tone value $v$ ; $Z$ is the normalization factor that makes $\int_{0}^{1} p(v) dv = 1$ ; $p_{i}(v)$ is the distribution of the $i$-th of the three tonal layers of a pencil drawing; the weights $w_{i}$ coarsely correspond to the number of pixels in each tonal layer. Tone values are scaled into the range $[0, 1]$ to cancel out illumination differences when computing the distributions.

As Fig. (9) shows, the bright and dark layers have obvious peaks, while the mild tone layer does not correspond to a unique peak. The paper therefore models the bright layer with a Laplacian distribution ■one-sided Laplacian, i.e. exponential, distribution■ whose peak is at the brightest value, 255 (1 after normalization):

$p_{1}(v) = \begin{cases} \frac{1}{\sigma_{b}} e^{- \frac{1 - v}{\sigma_{b}}}, & \ \text{if } v \leq 1 \\ 0, & \ \text{otherwise} \end{cases} \tag {5}$

where $\sigma_{b}$ is the scale of the distribution. The mild tone layer is modeled with a uniform distribution:

$p_{2}(v) = \begin{cases} \frac{1}{u_{b} - u_{a}}, & \ \text{if } u_{a} \leq v \leq u_{b} \\ 0, & \ \text{otherwise} \end{cases} \tag {6}$

where $u_{a}$ and $u_{b}$ are the lower and upper bounds of the range, respectively. The dark layer is modeled with a Gaussian distribution:

$p_{3}(v) = \frac{1}{\sqrt{2 \pi} \sigma_{d}} e^{- \frac{(v - \mu_{d})^{2}}{2 \sigma_{d}^{2}}} \tag {7}$

where $\mu_{d}$ is the mean value of the dark strokes and $\sigma_{d}$ is the scale parameter.
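To make the mixture of Eq. (4) concrete, the sketch below evaluates the three layer densities on a discrete grid of tone values and normalizes the result. The function name `target_tone_histogram` and all parameter values in the usage line are illustrative assumptions, not values from the paper; dividing by the sum stands in for $\frac{1}{Z}$ on the discrete grid.

```python
import numpy as np

def target_tone_histogram(w, sigma_b, u_a, u_b, mu_d, sigma_d, levels=256):
    """Discretize the mixture p(v) on `levels` tone values in [0, 1]."""
    v = np.linspace(0.0, 1.0, levels)
    # Bright layer, Eq. (5); every grid value satisfies v <= 1.
    p1 = np.exp(-(1.0 - v) / sigma_b) / sigma_b
    # Mild layer, Eq. (6): uniform on [u_a, u_b].
    p2 = np.where((v >= u_a) & (v <= u_b), 1.0 / (u_b - u_a), 0.0)
    # Dark layer, Eq. (7), with the standard Gaussian normalization.
    p3 = np.exp(-(v - mu_d) ** 2 / (2.0 * sigma_d ** 2)) / (np.sqrt(2.0 * np.pi) * sigma_d)
    p = w[0] * p1 + w[1] * p2 + w[2] * p3
    return p / p.sum()

# Illustrative parameters only (tone values scaled to [0, 1]).
hist = target_tone_histogram(w=(0.5, 0.35, 0.15), sigma_b=0.05,
                             u_a=0.4, u_b=0.9, mu_d=0.35, sigma_d=0.05)
```

With these made-up parameters the histogram peaks at the brightest level, as expected for a drawing that is mostly white paper.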

Parameter learning

The input grayscale image $\mathbf{I}$ is slightly Gaussian smoothed. The bright and dark tone layers are indicated manually by intensity thresholds, and the remaining pixels form the mild-tone layer; the number of pixels in each layer decides the weights $w$ . The parameters of each layer are estimated by Maximum Likelihood Estimation (MLE). Denoting the mean and standard deviation of the pixel values in each layer by $m$ and $s$ , the closed-form solutions are:

$\begin{aligned} & \sigma_{b} = \frac{1}{N} \sum_{i = 1}^{N} |x_{i} - 1| \\ & u_{a} = m_{m} - \sqrt{3} s_{m}, u_{b} = m_{m} + \sqrt{3} s_{m} \\ & \mu_{d} = m_{d}, \sigma_{d} = s_{d} \end{aligned}$

where $x_{i}$ is a pixel value and $N$ is the number of pixels in the layer.

■■

$x_{i} \in [0, 1]$

(1) $p_{1} (v)$ : $\mathcal{X} = \{ x_{1}, \cdots, x_{N} \}$ is the set of all pixels in the bright layer, $x_{i} \in \mathcal{X}$

$\begin{aligned} \sigma_{b}^{\ast} & = \argmax_{\sigma_{b}} p(\mathcal{X}; \sigma_{b}) \\ & = \argmax_{\sigma_{b}} \prod_{i = 1}^{N} p(x_{i}; \sigma_{b}) \\ & = \argmax_{\sigma_{b}} \sum_{i = 1}^{N} \log p(x_{i}; \sigma_{b}) \\ & = \argmax_{\sigma_{b}} \sum_{i = 1}^{N} \left( - \log \sigma_{b} - \frac{1 - x_{i}}{\sigma_{b}} \right) \end{aligned}$

Setting the derivative of the log-likelihood with respect to $\sigma_{b}$ to zero gives

$\begin{aligned} & \sum_{i = 1}^{N} \left( - \frac{1}{\sigma_{b}} + \frac{1 - x_{i}}{\sigma_{b}^{2}} \right) = 0 \\ & \Downarrow \\ & \sigma_{b}^{\ast} = \frac{1}{N} \sum_{i = 1}^{N} (1 - x_{i}) \end{aligned}$

Since $x_{i} \leq 1$ , this equals $\frac{1}{N} \sum_{i = 1}^{N} |x_{i} - 1|$ .

(2) $p_{2} (v)$ : $\mathcal{X} = \{ x_{1}, \cdots, x_{N} \}$ is the set of all pixels in the mild-tone layer, $x_{i} \in \mathcal{X}$ . The bounds $u_{a}$ and $u_{b}$ follow from matching the mean $\frac{u_{a} + u_{b}}{2}$ and the standard deviation $\frac{u_{b} - u_{a}}{2 \sqrt{3}}$ of a uniform distribution to $m_{m}$ and $s_{m}$ :

$u_{a} = m_{m} - \sqrt{3} s_{m}, u_{b} = m_{m} + \sqrt{3} s_{m}$

(3) $p_{3} (v)$ : $\mathcal{X} = \{ x_{1}, \cdots, x_{N} \}$ is the set of all pixels in the dark layer, $x_{i} \in \mathcal{X}$

$\mu_{d} = m_{d}, \sigma_{d} = s_{d}$

■
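The closed-form estimates can be computed directly from the pixels of an example drawing. The sketch below is an illustration under assumptions: `fit_tone_layers`, `dark_max`, and `bright_min` are hypothetical names (the thresholds are set manually in the paper), pixel values are already scaled to $[0, 1]$, and every layer is assumed to be non-empty.

```python
import numpy as np

def fit_tone_layers(pixels, dark_max, bright_min):
    """Split pixels into three layers by thresholds and fit each layer."""
    x = np.asarray(pixels, dtype=np.float64).ravel()
    bright = x[x >= bright_min]
    dark = x[x <= dark_max]
    mild = x[(x > dark_max) & (x < bright_min)]
    spread = np.sqrt(3.0) * mild.std()
    return {
        # Weights from the pixel counts, ordered (bright, mild, dark).
        "w": np.array([bright.size, mild.size, dark.size]) / x.size,
        "sigma_b": np.mean(np.abs(bright - 1.0)),
        "u_a": mild.mean() - spread,
        "u_b": mild.mean() + spread,
        "mu_d": dark.mean(),
        "sigma_d": dark.std(),   # population std (ddof=0)
    }

params = fit_tone_layers([0.1, 0.2, 0.3, 0.5, 0.6, 0.7, 0.9, 1.0],
                         dark_max=0.3, bright_min=0.9)
```

For this toy input the bright layer is {0.9, 1.0}, so `sigma_b` is 0.05, and the mild layer {0.5, 0.6, 0.7} gives bounds of roughly 0.459 and 0.741.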

■ $w_{1}$ and $w_{3}$ are swapped ■

Based on the parametric $p_{1}$ , $p_{2}$ , and $p_{3}$ , the tone map of each new input image is adjusted by simple histogram matching in all three layers, and the layers are then superposed again (Fig. (11)). The output is denoted $\mathbf{J}$ .
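Histogram matching itself can be sketched with cumulative distributions. Note the simplification: this example matches the whole image against one target histogram in a single pass, whereas the notes describe matching in each of the three layers and superposing the results. `match_histogram` is a name made up for this example, and the input is assumed to hold integer tone levels.

```python
import numpy as np

def match_histogram(gray, target_hist):
    """Remap integer tone levels so their histogram follows target_hist."""
    gray = np.asarray(gray)
    levels = len(target_hist)
    src_hist = np.bincount(gray.ravel(), minlength=levels).astype(np.float64)
    src_cdf = np.cumsum(src_hist) / gray.size
    tgt_cdf = np.cumsum(target_hist) / np.sum(target_hist)
    # For each source level, take the first target level whose CDF
    # reaches the source CDF value.
    mapping = np.searchsorted(tgt_cdf, src_cdf, side="left")
    mapping = np.clip(mapping, 0, levels - 1)
    return mapping[gray]

# Toy example with 4 tone levels: push everything into the two brightest.
gray = np.array([[0, 1], [2, 3]], dtype=np.uint8)
J = match_histogram(gray, np.array([0.0, 0.0, 0.5, 0.5]))
# J == [[2, 2], [3, 3]]
```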

Pencil texture rendering

Tonal texture refers to pencil patterns without obvious direction, which reveal only the tone information. The paper uses the tone map to learn human-drawn tonal patterns, as shown in Fig. (12).

In human drawing, tonal pencil texture is generated by repeatedly drawing at the same place. The paper simulates this process by multiplication of strokes, which results in the exponential combination $\mathbf{H} (x)^{\mathbf{\beta} (x)} \approx \mathbf{J} (x)$ (or, in the logarithmic domain, $\mathbf{\beta} (x) \log \mathbf{H} (x) \approx \log \mathbf{J} (x)$ ), i.e. drawing pattern $\mathbf{H}$ $\mathbf{\beta}$ times to approximate the local tone in $\mathbf{J}$ , as in Fig. (12).

In addition, $\mathbf{\beta}$ must be locally smooth; it is obtained by solving Eq. (8):

$\mathbf{\beta}^{\ast} = \argmin_{\mathbf{\beta}} \| \mathbf{\beta} \log \mathbf{H} - \log \mathbf{J} \|_{2}^{2} + \lambda \| \nabla \mathbf{\beta} \|_{2}^{2} \tag {8}$
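Eq. (8) is a quadratic problem, so its minimizer satisfies a linear system: $(\mathrm{diag}(h^{2}) + \lambda (D_{x}^{\top} D_{x} + D_{y}^{\top} D_{y})) \beta = h \odot j$ with $h = \log \mathbf{H}$ and $j = \log \mathbf{J}$ . The sketch below solves it with dense matrices, which is only practical for tiny images; a real implementation would more likely use a sparse solver. The names `forward_diff_matrix` and `solve_beta`, the default `lam`, the forward-difference gradient, and the clipping constant `eps` are assumptions of this example.

```python
import numpy as np

def forward_diff_matrix(n):
    """(n-1) x n matrix D with (D b)[k] = b[k+1] - b[k]."""
    D = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D

def solve_beta(H, J, lam=0.2, eps=1e-6):
    """Minimize ||beta*log(H) - log(J)||^2 + lam*||grad(beta)||^2."""
    # Keep tones strictly inside (0, 1) so the logs are finite and non-zero.
    H = np.clip(np.asarray(H, dtype=np.float64), eps, 1.0 - eps)
    J = np.clip(np.asarray(J, dtype=np.float64), eps, 1.0 - eps)
    rows, cols = H.shape
    h = np.log(H).ravel()                      # row-major pixel order
    j = np.log(J).ravel()
    Dx = np.kron(np.eye(rows), forward_diff_matrix(cols))   # within rows
    Dy = np.kron(forward_diff_matrix(rows), np.eye(cols))   # between rows
    A = np.diag(h ** 2) + lam * (Dx.T @ Dx + Dy.T @ Dy)
    beta = np.linalg.solve(A, h * j)
    return beta.reshape(rows, cols)

# If J is exactly H squared, drawing H twice everywhere is optimal.
H = np.full((4, 5), 0.5)
J = H ** 2
beta = solve_beta(H, J)        # close to 2 at every pixel
T = H ** beta                  # texture rendered under the exponential model
```

In the toy case both terms of the objective vanish at a constant exponent of 2, so `T` reproduces `J`.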