Study Notes on "Sequence to Sequence Learning with Neural Networks"

Main contribution of the paper

It proposes a general approach to sequence-to-sequence learning.

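Background

Limitations of DNNs:

The input and output vectors must have fixed dimensionality, yet in many sequence problems the sequence length is not known a priori.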

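Limitations of a single RNN:

The input and output must have the same length and be aligned one to one, with the alignment known ahead of time. A single RNN therefore cannot be applied when the input and output differ in length and their correspondence is non-monotonic.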

Solution

Two RNNs: feasible in theory

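One RNN maps the input sequence to a fixed-size vector (the context vector, which carries the information of the whole input sentence), and a second RNN maps that context vector to the target sequence.
In practice, however, this is hard to train because of the long-term dependencies involved.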
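The two-RNN idea above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the paper's model: the weights are random and untrained, the cell is a vanilla tanh RNN, decoding is greedy, and the names (`encode`, `decode`, `VOCAB`, `HIDDEN`, `EOS`) as well as the choice to feed `EOS` as the decoder's first input are made up for this sketch. The point is that inputs of any length are folded into a vector of the same fixed size, and the output length is decided by the decoder.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random (untrained) weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(index, size):
    """Encode a token id as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def rnn_step(w_in, w_rec, x, h):
    """One vanilla RNN step: h' = tanh(W_in x + W_rec h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]


def encode(tokens, w_in, w_rec, hidden_size, vocab_size):
    """Fold a variable-length token list into one fixed-size context vector."""
    h = [0.0] * hidden_size
    for token in tokens:
        h = rnn_step(w_in, w_rec, one_hot(token, vocab_size), h)
    return h


def decode(context, w_in, w_rec, w_out, vocab_size, eos, max_len):
    """Greedily emit tokens from the context vector until EOS or max_len."""
    h = context
    prev = eos  # assumption: EOS also serves as the start symbol
    output = []
    for _ in range(max_len):
        h = rnn_step(w_in, w_rec, one_hot(prev, vocab_size), h)
        scores = matvec(w_out, h)
        prev = max(range(vocab_size), key=lambda i: scores[i])
        if prev == eos:
            break
        output.append(prev)
    return output


# Hypothetical sizes chosen only for the illustration.
VOCAB, HIDDEN, EOS = 6, 4, 0
rng = random.Random(0)
enc_in, enc_rec = make_matrix(HIDDEN, VOCAB, rng), make_matrix(HIDDEN, HIDDEN, rng)
dec_in, dec_rec = make_matrix(HIDDEN, VOCAB, rng), make_matrix(HIDDEN, HIDDEN, rng)
dec_out = make_matrix(VOCAB, HIDDEN, rng)

short_ctx = encode([1, 2], enc_in, enc_rec, HIDDEN, VOCAB)
long_ctx = encode([3, 4, 5, 1, 2], enc_in, enc_rec, HIDDEN, VOCAB)
generated = decode(long_ctx, dec_in, dec_rec, dec_out, VOCAB, EOS, max_len=8)
```

Both `short_ctx` and `long_ctx` have `HIDDEN` entries regardless of input length. Since the weights are untrained, the tokens in `generated` carry no meaning; only the shapes and the control flow are of interest here.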

Two LSTMs: feasible in practice
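For reference, the quantity the two-LSTM model estimates can be written as below, following the paper's notation as far as I recall it: x_1, ..., x_T is the input sequence, y_1, ..., y_{T'} is the output sequence (T' may differ from T), and v is the fixed-dimensional representation of the input taken from the last hidden state of the first LSTM. Each factor on the right is a softmax over the vocabulary.

```latex
p(y_1, \ldots, y_{T'} \mid x_1, \ldots, x_T)
  = \prod_{t=1}^{T'} p(y_t \mid v, y_1, \ldots, y_{t-1})
```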

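In addition, reversing the order of the words in the input sentence gives better results. The authors' explanation: reversal introduces many short-term dependencies, because the first source words end up close to the first target words and the minimal time lag is greatly reduced.
The learned representations capture both word order and meaning: sentences that are close in meaning are close in the representation space. The representations are sensitive to the order of words, while being fairly insensitive to replacing the active voice with the passive voice.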
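The time-lag argument can be checked with a little arithmetic. The sketch below is an illustration under simplifying assumptions: source and target have the same length `n`, word `i` of the source corresponds to word `i` of the target, and the "lag" is the number of time steps between the two when the target is fed right after the source. The function name `time_lags` is made up for this example.

```python
def time_lags(n, reverse_source):
    """Steps between source word i and target word i in the joined sequence.

    Assumes equal lengths and a monotonic one-to-one alignment
    (a simplification used only for this illustration).
    """
    lags = []
    for i in range(n):
        src_pos = (n - 1 - i) if reverse_source else i  # position in the input
        tgt_pos = n + i                                 # target follows the source
        lags.append(tgt_pos - src_pos)
    return lags


plain = time_lags(4, reverse_source=False)   # [4, 4, 4, 4]
flipped = time_lags(4, reverse_source=True)  # [1, 3, 5, 7]

mean_plain = sum(plain) / len(plain)         # 4.0
mean_flipped = sum(flipped) / len(flipped)   # 4.0
```

The average lag is the same in both cases, but after reversal the smallest lag drops from 4 to 1, so the first target words sit right next to the source words they depend on.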

Related work

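This work is closely related to [18] N. Kalchbrenner and P. Blunsom. Recurrent continuous translation models. In EMNLP, 2013., which was the first to map a sentence to a vector and then back to a sentence. The difference is that [18] uses a CNN to map the sentence to a vector, which loses the ordering of the words.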

[5] K. Cho, B. Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Arxiv preprint arXiv:1406.1078, 2014.

[2] D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.