Paper Reading: LANE (Label Informed Attributed Network Embedding), Principles and Implementation

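An SKU embedding method for labeled data

Method name: Label Informed Attributed Network Embedding

Abbreviation: LANE

An SKU embedding vector should capture three things: user behavior toward the SKU, SKU attributes, and SKU labels.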

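Basic algorithm flow

  • Construct a network from users' SKU page-view (pv) sequences
  • Clean and extract the SKU attributes
  • Feed both into the model to compute the embeddings: LANE(network, attribute, (label), dim)
  • Evaluate the SKU embedding vectors
  • Feed the embeddings into a seq2seq model for training
  • Predict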

LANE details

Network construction

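  • Read each user's SKU pv sequence from the Hive table
  • For every pair of adjacent SKUs in a sequence, add a directed edge between the corresponding nodes in the network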
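
The two steps above can be sketched in plain Python with NumPy. This is a minimal illustration, not the post's Spark implementation: the function name `build_adjacency`, the in-memory list of sequences standing in for the Hive table, the use of transition counts as edge weights, and the choice to skip repeated views of the same SKU are all assumptions made for the example.

```python
from collections import Counter

import numpy as np


def build_adjacency(pv_sequences):
    """Build a weighted, directed adjacency matrix from per-user SKU pv sequences."""
    # Give each distinct SKU a row/column index in order of first appearance.
    sku_index = {}
    for seq in pv_sequences:
        for sku in seq:
            sku_index.setdefault(sku, len(sku_index))

    # Count how often one SKU is viewed immediately before another.
    edge_counts = Counter()
    for seq in pv_sequences:
        for src, dst in zip(seq, seq[1:]):
            if src != dst:  # assumption: ignore consecutive views of the same SKU
                edge_counts[(src, dst)] += 1

    # Assumption: the edge weight is the number of observed transitions.
    n = len(sku_index)
    G = np.zeros((n, n))
    for (src, dst), count in edge_counts.items():
        G[sku_index[src], sku_index[dst]] = count
    return G, sku_index


# Hypothetical toy data: three users' pv sequences.
sequences = [["a", "b", "c"], ["a", "b"], ["c", "a"]]
G, sku_index = build_adjacency(sequences)
```

Row i, column j of `G` holds the weight of the directed edge from SKU i to SKU j, so the matrix is generally not symmetric.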

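Algorithm pseudocode

Algorithm: Label Informed Attributed Network Embedding
Input: $d$ (embedding dimension)
Input: $max\_iter$ (number of iterations)
Input: $G$ (weighted adjacency matrix)
Input: $A$ (attribute matrix)
Input: $\alpha_{1}$, $\alpha_{2}$ (weight parameters)

Output: $H$ (SKU embedding matrix)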

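Let $n$ be the number of SKUs (i.e., the number of nodes in the constructed graph), $m$ the dimension of the SKU attributes, $k$ the dimension of the SKU labels, and $d$ the dimension of the SKU embedding vectors.

$G \in \mathbb{R}^{n \times n}, A \in \mathbb{R}^{n \times m}, Y \in \mathbb{R}^{n \times k}$

$S^{(G)}, S^{(A)} \in \mathbb{R}^{n \times n}$

$L^{(G)}, L^{(A)}, L^{(Y)} \in \mathbb{R}^{n \times n}$

$U^{(G)}, U^{(A)}, U^{(Y)}, H \in \mathbb{R}^{n \times d}$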

1 : Construct the affinity matrices $S^{(G)}$ and $S^{(A)}$
2 : Compute the Laplacian matrices $L^{(G)}$, $L^{(A)}$ and $L^{(Y)}$
3 : Initialize $t = 1$, $U^{(A)} = 0$, $U^{(Y)} = 0$, $H = 0$
4 : repeat
5 :      Update $U^{(G)}$
         $(L^{(G)} + \alpha_{1} U^{(A)} U^{(A)\top} + \alpha_{2} U^{(Y)} U^{(Y)\top} + H H^{\top}) U^{(G)} = \lambda_{1} U^{(G)}$
6 :      Update $U^{(A)}$
         $(\alpha_{1} L^{(A)} + \alpha_{1} U^{(G)} U^{(G)\top} + H H^{\top}) U^{(A)} = \lambda_{2} U^{(A)}$
7 :      Update $U^{(Y)}$
         $(\alpha_{2} L^{(Y)} + \alpha_{2} U^{(G)} U^{(G)\top} + H H^{\top}) U^{(Y)} = \lambda_{3} U^{(Y)}$
8 :      Update $H$
         $(U^{(G)} U^{(G)\top} + U^{(A)} U^{(A)\top} + U^{(Y)} U^{(Y)\top}) H = \lambda_{4} H$
9 :      $t = t + 1$
10 : until $max\_iter$ is reached
11 : return $H$
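
The loop above can be sketched on a single machine with NumPy. The pseudocode does not say how the affinity and Laplacian matrices are built or how each eigenproblem is solved, so the sketch fills those gaps with assumptions: cosine similarity for $S^{(G)}$ and $S^{(A)}$, $Y Y^{\top}$ as the label affinity, the normalized form $D^{-1/2} S D^{-1/2}$ for each Laplacian, and the eigenvectors of the $d$ largest eigenvalues as the solution of each update. All function names are hypothetical, and the stopping rule is simply a fixed number of iterations.

```python
import numpy as np


def cosine_affinity(X):
    """Pairwise cosine similarity between the rows of X (assumed affinity)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)  # guard against all-zero rows
    return Xn @ Xn.T


def normalized_laplacian(S):
    """Assumed form D^{-1/2} S D^{-1/2}, with D the diagonal of row sums of S."""
    deg = S.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    return S * inv_sqrt[:, None] * inv_sqrt[None, :]


def top_eigenvectors(M, d):
    """Eigenvectors of the symmetric matrix M for its d largest eigenvalues."""
    M = (M + M.T) / 2.0  # remove floating-point asymmetry before eigh
    _, vecs = np.linalg.eigh(M)  # eigh returns eigenvalues in ascending order
    return vecs[:, -d:]


def lane(G, A, Y, d, alpha1, alpha2, max_iter=5):
    n = G.shape[0]
    # Steps 1-2: affinity matrices and their Laplacians.
    L_G = normalized_laplacian(cosine_affinity(G))
    L_A = normalized_laplacian(cosine_affinity(A))
    L_Y = normalized_laplacian(Y @ Y.T)
    # Step 3: initialization.
    U_A = np.zeros((n, d))
    U_Y = np.zeros((n, d))
    H = np.zeros((n, d))
    # Steps 4-10: alternate the four eigenproblem updates.
    for _ in range(max_iter):
        U_G = top_eigenvectors(
            L_G + alpha1 * U_A @ U_A.T + alpha2 * U_Y @ U_Y.T + H @ H.T, d)
        U_A = top_eigenvectors(alpha1 * L_A + alpha1 * U_G @ U_G.T + H @ H.T, d)
        U_Y = top_eigenvectors(alpha2 * L_Y + alpha2 * U_G @ U_G.T + H @ H.T, d)
        H = top_eigenvectors(U_G @ U_G.T + U_A @ U_A.T + U_Y @ U_Y.T, d)
    return H


# Hypothetical toy inputs: n=6 SKUs, m=4 attributes, k=2 label classes, d=2.
rng = np.random.default_rng(0)
n, m, k, d = 6, 4, 2, 2
G = rng.random((n, n))
A = rng.random((n, m))
Y = np.eye(k)[rng.integers(0, k, size=n)]  # one-hot labels
H = lane(G, A, Y, d, alpha1=1.0, alpha2=1.0)
```

Each update is a dense $n \times n$ eigendecomposition, so this form only suits small graphs; it is meant to make the data flow between $U^{(G)}$, $U^{(A)}$, $U^{(Y)}$ and $H$ concrete rather than to scale.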

Key Spark code