
Paper notes: "FutureGAN: Anticipating the Future Frames of Video Sequences using ……"

Category: Articles • 2024-06-03 14:54:58

FutureGAN: Anticipating the Future Frames of Video Sequences using Spatio-Temporal 3d Convolutions in Progressively Growing Autoencoder GANs

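Summary

The paper uses an autoencoder GAN model. The network structure is simple and the generated images look fairly realistic, but the final prediction results are painfully bad.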

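Main contributions

The authors' main work is a reimplementation of PGGAN from the paper by Karras et al., "Progressive Growing of GANs for Improved Quality, Stability, and Variation", adapting what was originally an image generation network into a network for video prediction.

3D convolutions / 3D transposed convolutions are introduced in the encoder and decoder to capture temporal features
During training, the resolution of the input images is increased gradually from small to large, while the number of layers in the network grows along with it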
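The progressive growing idea can be sketched in a few lines of plain Python. This is only an illustration, not the authors' code: the starting resolution of 4, the final resolution of 128, and the helper names `resolution_schedule`, `fade_in_alpha`, and `blend` are assumptions made for the example. The linear fade-in of a newly added layer follows the description in the PGGAN paper; these notes do not say whether FutureGAN keeps it unchanged.

```python
def resolution_schedule(start=4, final=128):
    """Spatial resolutions visited during training, doubling at each stage.

    The default start/final values are assumptions for illustration.
    """
    resolutions = []
    r = start
    while r <= final:
        resolutions.append(r)
        r *= 2
    return resolutions


def fade_in_alpha(step, fade_steps):
    """Blend weight for a newly added layer: rises linearly from 0 to 1."""
    return min(1.0, step / fade_steps)


def blend(alpha, old_path, new_path):
    """Mix the (upsampled) output of the old layers with the new layer's output."""
    return [(1.0 - alpha) * o + alpha * n for o, n in zip(old_path, new_path)]


schedule = resolution_schedule()
alphas = [fade_in_alpha(step, 4) for step in range(6)]
mixed = blend(0.25, [0.0, 2.0], [4.0, 2.0])
```

With `alpha` at 0 the network behaves exactly like the previous, lower-resolution stage; at 1 the new layer has fully taken over, which is what lets layers be added without a sudden jump in the output.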

Network architecture

It is simply a GAN, in which the generator G is an encoder followed by a decoder.
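To make the encoder + decoder structure more concrete, the sketch below traces how the (frames, height, width) shape of a clip could change through 3D convolutions in an encoder and 3D transposed convolutions in a decoder. The output-size formulas are the standard ones for convolutions without dilation or output padding. Everything else is hypothetical: the clip size of 6 frames at 8x8 and all kernel, stride, and padding values are chosen only for illustration and are not taken from the paper.

```python
def conv_out(n, k, s=1, p=0):
    """Output length of a convolution along one axis (no dilation)."""
    return (n + 2 * p - k) // s + 1


def tconv_out(n, k, s=1, p=0):
    """Output length of a transposed convolution along one axis (no output padding)."""
    return (n - 1) * s - 2 * p + k


def apply_3d(fn, shape, kernel, stride, padding):
    """Apply a per-axis size formula to a (frames, height, width) shape."""
    return tuple(fn(n, k, s, p) for n, k, s, p in zip(shape, kernel, stride, padding))


clip = (6, 8, 8)  # hypothetical input: 6 frames of 8x8 pixels

# Encoder: spatio-temporal features, spatial downsampling, then temporal collapse.
e1 = apply_3d(conv_out, clip, (3, 3, 3), (1, 1, 1), (1, 1, 1))
e2 = apply_3d(conv_out, e1, (1, 2, 2), (1, 2, 2), (0, 0, 0))
code = apply_3d(conv_out, e2, (6, 1, 1), (1, 1, 1), (0, 0, 0))

# Decoder: mirror the encoder with transposed convolutions.
d1 = apply_3d(tconv_out, code, (6, 1, 1), (1, 1, 1), (0, 0, 0))
out = apply_3d(tconv_out, d1, (1, 2, 2), (1, 2, 2), (0, 0, 0))
```

In this toy configuration the encoder squeezes the six input frames into a single temporal slice, and the decoder expands that slice back into six frames, which is one way a generator can map past frames to the same number of future frames.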

Loss function

WGAN-GP loss
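For reference, the general form of the WGAN-GP objective, as given in the WGAN-GP paper by Gulrajani et al., is written out below. Mapping it onto video prediction is an assumption made here: x stands for the real future frames, x̃ = G(input frames) for the generated ones, and x̂ for random interpolations between the two. These notes do not record the penalty weight λ or any extra terms FutureGAN may add; the WGAN-GP paper itself uses λ = 10.

```latex
\begin{aligned}
L_D &= \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
     - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
     + \lambda\, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
       \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big], \\
L_G &= -\,\mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big], \\
\hat{x} &= \epsilon\, x + (1 - \epsilon)\, \tilde{x}, \qquad \epsilon \sim U[0, 1].
\end{aligned}
```

The last term in L_D is the gradient penalty: it pushes the norm of the critic's gradient toward 1 at the interpolated points, replacing the weight clipping used in the original WGAN.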

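Future directions

The authors mention that one future direction is to feed several adjacent frames into the discriminator D, so that objects do not change too much in the frames G generates. In the example figure, for instance, a paved road has turned into a dirt road……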