How would I import a .pkl.gz file from my computer into a Python program?

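Problem description:

I have my own dataset and want to train my model on it. I have successfully created the .pkl.gz files, but I don't know how to import them into my model. How would I import a .pkl.gz file from my computer into a Python program?

I'm using Windows 10 and Python 3.5.2 with TensorFlow and tflearn, and I write my code in Sublime Text 3.

The code I used to create the pickle files: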

from numpy import genfromtxt 

import gzip 
import _pickle as cPickle 

#data = sio.loadmat('C:/DeepLearning_lib/Theano/Data/test_x.mat') 

train_set_x = genfromtxt('C:/Users/Jay/Desktop/MachineLearning/dataset/NSL-KDD Processed/Kdd_Train_41.csv', delimiter=',') 

train_set_y = genfromtxt('C:/Users/Jay/Desktop/MachineLearning/dataset/NSL-KDD Processed/NSL_TrainLabels_mat4.csv', delimiter=',') 

valid_set_x = genfromtxt('C:/Users/Jay/Desktop/MachineLearning/dataset/NSL-KDD Processed/Kdd_Valid_41.csv', delimiter=',') 

valid_set_y = genfromtxt('C:/Users/Jay/Desktop/MachineLearning/dataset/NSL-KDD Processed/NSL_ValidLabels_int2.csv', delimiter=',') 

test_set_x = genfromtxt('C:/Users/Jay/Desktop/MachineLearning/dataset/NSL-KDD Processed/Kdd_Test_41.csv', delimiter=',') 

test_set_y = genfromtxt('C:/Users/Jay/Desktop/MachineLearning/dataset/NSL-KDD Processed/NSL_TestLabels_mat5.csv', delimiter=',') 



# Each split is taken from its own CSV files (train from Kdd_Train_41.csv, and so on).
train_set = train_set_x
train_set_labels = train_set_y

valid_set = valid_set_x
valid_set_labels = valid_set_y

test_set = test_set_x
test_set_labels = test_set_y


# Write each array to its own gzip-compressed pickle file.
# The with-statement closes every file even if dump() fails.
with gzip.open('C:/Users/Jay/Desktop/Data/train_set.pkl.gz', 'wb') as f:
    cPickle.dump(train_set, f, protocol=2)

with gzip.open('C:/Users/Jay/Desktop/Data/train_set_labels.pkl.gz', 'wb') as f:
    cPickle.dump(train_set_labels, f, protocol=2)

with gzip.open('C:/Users/Jay/Desktop/Data/valid_set_labels.pkl.gz', 'wb') as f:
    cPickle.dump(valid_set_labels, f, protocol=2)

with gzip.open('C:/Users/Jay/Desktop/Data/test_set_labels.pkl.gz', 'wb') as f:
    cPickle.dump(test_set_labels, f, protocol=2)

with gzip.open('C:/Users/Jay/Desktop/Data/valid_set.pkl.gz', 'wb') as f:
    cPickle.dump(valid_set, f, protocol=2)

with gzip.open('C:/Users/Jay/Desktop/Data/test_set.pkl.gz', 'wb') as f:
    cPickle.dump(test_set, f, protocol=2)

Error when using 'rb':

'OSError: [Errno 9] peek() on write-only GzipFile object' 

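Can you show the code you used to create the files, and tell us what kind of data they contain? If `.pkl` means you wrote them with `pickle.dump()` from the Python [`pickle`](https://docs.python.org/2/library/pickle.html) module, you should be able to use `pickle.load()` to retrieve the objects. – mrry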
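
As a minimal sketch of the round trip this comment describes, the snippet below writes an object with `pickle.dump()` through `gzip.open(..., 'wb')` and reads it back with `pickle.load()` through `gzip.open(..., 'rb')`. The temporary file and the small nested list are assumptions that stand in for the real .pkl.gz files and the NumPy arrays from the question.

```python
import gzip
import os
import pickle
import tempfile

# Stand-in data; in the question this would be a NumPy array from genfromtxt.
sample = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]

# Hypothetical temporary path, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.pkl.gz")

# Write in binary write mode, with the same protocol as in the question.
with gzip.open(path, "wb") as f:
    pickle.dump(sample, f, protocol=2)

# Read back in binary read mode.
with gzip.open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == sample)
```

The only difference between the two halves is the mode string: 'wb' for `dump()`, 'rb' for `load()`.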

`import gzip import _pickle as cPickle f = gzip.open('C:/Users/Jay/Desktop/Data/train_set.pkl.gz', 'wb') cPickle.load(train_set) print(train_set) f.close()` It's not importing –

The following code should reconstruct your `train_set`:

with gzip.open('C:/Users/Jay/Desktop/Data/train_set.pkl.gz', 'rb') as f: 
    train_set = cPickle.load(f) 
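
The same pattern extends to all six files from the question. The sketch below is one way to do it, assuming the files sit in a single directory and keep the names they were written with. `load_pickle_gz` and `load_dataset` are hypothetical helper names, not part of gzip, pickle, or tflearn.

```python
import gzip
import os
import pickle

# File stems as written by the code in the question.
DATASET_NAMES = [
    "train_set", "train_set_labels",
    "valid_set", "valid_set_labels",
    "test_set", "test_set_labels",
]


def load_pickle_gz(path):
    """Load one gzip-compressed pickle; the file is opened in 'rb' mode."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


def load_dataset(data_dir, names=DATASET_NAMES):
    """Return a dict mapping each name to the object stored in <name>.pkl.gz."""
    return {
        name: load_pickle_gz(os.path.join(data_dir, name + ".pkl.gz"))
        for name in names
    }
```

With the paths from the question, `data = load_dataset('C:/Users/Jay/Desktop/Data')` followed by a look at `data['train_set'].shape` would be a quick check that the arrays came back with the expected dimensions.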

`OSError: [Errno 9] peek() on write-only GzipFile object` I'm getting this error! –

Oops, I copied `'wb'` from [your comment](http://stackoverflow.com/questions/42859435/how-would-i-import-pkl-gz-file-from-my-computer-into-a-python-program/42860527?noredirect=1#comment72825802_42859435) without thinking about what it means :). I've changed it to `'rb'`, and it should work now. – mrry

I changed it to rb, as it is in the documentation... no effect –
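
One possible explanation for "no effect", offered as an assumption rather than something confirmed in this thread: opening an existing file with `gzip.open(path, 'wb')` truncates it, so the earlier 'wb' attempt may have emptied train_set.pkl.gz. A later 'rb' load would then still fail, only with a different error, until the file is written again. The sketch below reproduces that sequence on a throwaway temporary file.

```python
import gzip
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.pkl.gz")  # throwaway file

# 1. Create a valid gzip-compressed pickle.
with gzip.open(path, "wb") as f:
    pickle.dump([1, 2, 3], f, protocol=2)

# 2. Mistakenly open it for writing and try to load from it.
#    Opening in 'wb' mode truncates the file before load() is even called.
wrong_mode_error = None
try:
    with gzip.open(path, "wb") as f:
        pickle.load(f)
except OSError as exc:
    wrong_mode_error = exc

# 3. Switch to 'rb': the mode is now right, but the pickled data is gone.
empty_file_error = None
try:
    with gzip.open(path, "rb") as f:
        pickle.load(f)
except EOFError as exc:
    empty_file_error = exc

print(type(wrong_mode_error).__name__, type(empty_file_error).__name__)
```

If this is what happened, rerunning the script that writes the .pkl.gz files and then loading them with 'rb' should bring the data back.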