Amazon Reviews - Full Dataset

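This dataset contains 34,686,770 reviews written by 6,643,669 Amazon users about 2,441,053 products. It is drawn mainly from the Stanford Network Analysis Project (SNAP). Each class in the dataset contains 600,000 training samples and 130,000 test samples.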
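To make the per-class figures concrete, here is a minimal sketch that works out the split sizes and counts reviews per class. It rests on assumptions not stated above: that the classes are the five star ratings (1 to 5), and that each row of the CSV files is a headerless record of class index, review title, and review text. The function names and the sample rows are hypothetical and only illustrate the layout.

```python
import csv
import io
from collections import Counter

# Figures from the dataset description: samples per class.
TRAIN_PER_CLASS = 600_000
TEST_PER_CLASS = 130_000

# Assumption: the classes are the five star ratings (1 to 5).
NUM_CLASSES = 5

train_total = TRAIN_PER_CLASS * NUM_CLASSES
test_total = TEST_PER_CLASS * NUM_CLASSES


def read_reviews(fileobj):
    """Yield (label, title, text) tuples from a headerless CSV file object.

    Assumes three columns per row: class index, review title, review text.
    Rows with a different number of columns are skipped.
    """
    for row in csv.reader(fileobj):
        if len(row) != 3:
            continue
        label, title, text = row
        yield int(label), title, text


def count_labels(reviews):
    """Count how many reviews fall into each class."""
    return Counter(label for label, _title, _text in reviews)


# Hypothetical sample rows standing in for the real file contents.
sample = io.StringIO(
    '"5","Great product","Works exactly as described."\n'
    '"1","Broke quickly","Stopped working after two days."\n'
    '"5","Love it","Would buy again."\n'
)
sample_counts = count_labels(read_reviews(sample))

print(f"train total: {train_total}, test total: {test_total}")
print(dict(sample_counts))
```

To run it on the downloaded data, pass an opened file instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points to wherever the training CSV was unpacked. If the assumptions hold, every class should come out at 600,000 in the training split.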

This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks.


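You can download the dataset from the official website. I have also shared a copy on Baidu Netdisk: follow my WeChat official account and reply "2020082105" to get the download link.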


Whenever I have time, I try to write articles to share and exchange ideas with everyone.


**** blog: https://blog.****.net/ispeasant