Converting a labelled dataset of multi-layer TIFFs into a format TensorFlow can use for model optimisation

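Problem description:

I am new to Python and TensorFlow, and was wondering:

How would you best go about converting a labelled dataset of multi-layer TIFFs into a format that TensorFlow can use for model optimisation/fine-tuning?

At the moment I have this code, which converts each layer of every multi-TIFF in a folder into a 3D array, but I need to preserve the label or filename of each multi-TIFF. I have seen some TensorFlow scripts that convert to TFRecords; however, I am not sure whether these preserve the filename. How would you best go about doing this? It will be quite a large dataset.

Any help is greatly appreciated.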

import os  # For file handling
from PIL import Image  # Pillow image processing library
import numpy

MultiTiffs = "MultiTiffs/"

for filename in os.listdir(MultiTiffs):
    # Import a multi-layer TIFF into a 3D NumPy array.
    img = Image.open(os.path.join(MultiTiffs, filename))
    imgArray = numpy.zeros((img.n_frames, img.size[1], img.size[0]), numpy.uint8)
    try:
        # Starts at frame 2 as posted; use range(img.n_frames) to read every frame.
        for frame in range(2, img.n_frames):
            img.seek(frame)
            imgArray[frame, :, :] = img
    except EOFError:
        # Ran past the last frame; rewind to the first one.
        img.seek(0)

    print(imgArray.shape)  # imgArray is now 3D
    print(imgArray.size)
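
One way to keep the filename attached to each array is to return the two together and collect them in a dictionary. This is only a minimal sketch: load_multipage_tiff and load_folder are hypothetical helper names, and it assumes 8-bit greyscale pages that all share the same size.

```python
import os

import numpy as np
from PIL import Image, ImageSequence


def load_multipage_tiff(path):
    """Return (name, array); array has shape (n_frames, height, width)."""
    with Image.open(path) as img:
        # ImageSequence.Iterator visits every page of the TIFF in order.
        frames = [np.array(page, dtype=np.uint8)
                  for page in ImageSequence.Iterator(img)]
    name = os.path.splitext(os.path.basename(path))[0]
    return name, np.stack(frames)


def load_folder(folder):
    """Map each TIFF's base name to its 3D array (hypothetical helper)."""
    volumes = {}
    for fname in sorted(os.listdir(folder)):
        if fname.lower().endswith((".tif", ".tiff")):
            name, volume = load_multipage_tiff(os.path.join(folder, fname))
            volumes[name] = volume
    return volumes
```

For a large dataset, holding every volume in memory at once may not be practical, so the per-file function is probably the more useful of the two.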

Best wishes

TWP

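OK, so I figured it out using Daniil's blog: http://warmspringwinds.github.io/tensorflow/tf-slim/2016/12/21/tfrecords-guide/

However, my current implementation produces multiple TFRecords, and I think it needs to be a single TFRecord, so I am trying to work out how to make it one. How do I do that?

Then I can validate it by reading it back with a TFRecords reading script and checking that it is in the correct format for TensorFlow. I currently get an error when using the reading script.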

from PIL import Image
import numpy as np
import tensorflow as tf
import os

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

path = 'test/'
output = 'output/'

fileList = [os.path.join(dirpath, f) for dirpath, dirnames, files in os.walk(path) for f in files if f.endswith('.tif')]

print(fileList)
for filename in fileList:

    basename = os.path.basename(filename)
    file_name = basename[:-4]
    print("processing file: ", filename)
    print(file_name)

    if not os.path.exists(output):
        os.mkdir(output)

    # A new writer is opened for every TIFF, so each file gets its own .tfrecord.
    writer = tf.python_io.TFRecordWriter(output + file_name + '.tfrecord')
    img = Image.open(filename)
    imgArray = np.zeros((img.n_frames, img.size[1], img.size[0]), np.uint8)
    # Import the multi-layer file into a 3D NumPy array.
    try:
        for frame in range(0, img.n_frames):
            img.seek(frame)
            imgArray[frame, :, :] = img
    except EOFError:
        img.seek(0)

    print("print img size:", img.size)
    print("print image shape: ", imgArray.shape)
    print("print image size: ", imgArray.size)

    annotation = np.array(Image.open(filename))

    # imgArray is laid out as (frames, rows, columns).
    depth = imgArray.shape[0]
    height = imgArray.shape[1]
    width = imgArray.shape[2]

    img_raw = imgArray.tobytes()
    annotation_raw = annotation.tobytes()

    example = tf.train.Example(features=tf.train.Features(feature={
        'height': _int64_feature(height),
        'width': _int64_feature(width),
        'depth': _int64_feature(depth),  # for 3rd dimension
        'image_raw': _bytes_feature(img_raw),
        'mask_raw': _bytes_feature(annotation_raw)}))

    writer.write(example.SerializeToString())
    writer.close()  # flush this file's record to disk
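
A single TFRecord file can probably be obtained by opening one writer before the loop and writing one Example per TIFF into it. The sketch below also stores the filename as a bytes feature so it survives the conversion. It assumes a TensorFlow version that exposes tf.io.TFRecordWriter; tiff_to_example, write_single_tfrecord and the 'filename' key are hypothetical names chosen for illustration, not part of the blog's code.

```python
import os

import numpy as np
import tensorflow as tf
from PIL import Image, ImageSequence


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def tiff_to_example(path):
    """Build one tf.train.Example from a multi-page greyscale TIFF."""
    with Image.open(path) as img:
        volume = np.stack([np.array(page, dtype=np.uint8)
                           for page in ImageSequence.Iterator(img)])
    depth, height, width = volume.shape  # (frames, rows, columns)
    return tf.train.Example(features=tf.train.Features(feature={
        # Assumed key: keeps the source filename next to the pixel data.
        'filename': _bytes_feature(os.path.basename(path).encode('utf-8')),
        'depth': _int64_feature(depth),
        'height': _int64_feature(height),
        'width': _int64_feature(width),
        'image_raw': _bytes_feature(volume.tobytes()),
    }))


def write_single_tfrecord(tiff_paths, out_path):
    """Write every TIFF into one TFRecord file."""
    # The writer is opened once, outside the loop, so all records share a file.
    with tf.io.TFRecordWriter(out_path) as writer:
        for path in tiff_paths:
            writer.write(tiff_to_example(path).SerializeToString())
```

With the fileList from the script above, the call would be something like write_single_tfrecord(fileList, 'output/all.tfrecord'), where the output name is only an example.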

My current TFRecords reading script:

import tensorflow as tf 
import os 

def read_and_decode(filename_queue): 
    reader = tf.TFRecordReader() 
    _, serialized_example = reader.read(filename_queue) 
    features = tf.parse_single_example(
     serialized_example, 
     # Defaults are not specified since both keys are required. 
     features={ 
      'image_raw': tf.FixedLenFeature([], tf.string), 
      'label': tf.FixedLenFeature([], tf.int64), 
      'height': tf.FixedLenFeature([], tf.int64), 
      'width': tf.FixedLenFeature([], tf.int64), 
      'depth': tf.FixedLenFeature([], tf.int64) 
     }) 
    image = tf.decode_raw(features['image_raw'], tf.uint8) 
    label = tf.cast(features['label'], tf.int32) 
    height = tf.cast(features['height'], tf.int32) 
    width = tf.cast(features['width'], tf.int32) 
    depth = tf.cast(features['depth'], tf.int32) 
    return image, label, height, width, depth 

with tf.Session() as sess: 
    filename_queue = tf.train.string_input_producer(["output/A.3.1.tfrecord"]) 
    image, label, height, width, depth = read_and_decode(filename_queue) 
    image = tf.reshape(image, tf.stack([height, width, 3])) 
    image.set_shape([32,32,3]) 
    init_op = tf.global_variables_initializer()
    sess.run(init_op) 
    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord) 
    for i in range(1000): 
        example, l = sess.run([image, label])
        print(example, l)
    coord.request_stop() 
    coord.join(threads) 

Error received:

InvalidArgumentError (see above for traceback): Name: <unknown>, Feature: label (data type: int64) is required but could not be found.
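
The error looks consistent with the reading script asking for a 'label' feature that the writing script never stored: the writer saves 'height', 'width', 'depth', 'image_raw' and 'mask_raw'. The feature keys requested when parsing have to match the keys that were written. Below is a minimal sketch of a parser that mirrors the writer's keys. It assumes a TensorFlow version with the tf.io and tf.data APIs and eager execution, and that 'depth' holds the number of frames while 'height' and 'width' hold the rows and columns of each frame; parse_example and load_dataset are hypothetical names.

```python
import tensorflow as tf

# These keys must match what the writer stored, one for one.
FEATURES = {
    'height': tf.io.FixedLenFeature([], tf.int64),
    'width': tf.io.FixedLenFeature([], tf.int64),
    'depth': tf.io.FixedLenFeature([], tf.int64),
    'image_raw': tf.io.FixedLenFeature([], tf.string),
    'mask_raw': tf.io.FixedLenFeature([], tf.string),
}


def parse_example(serialized):
    """Decode one serialized Example into a (depth, height, width) volume."""
    parsed = tf.io.parse_single_example(serialized, FEATURES)
    # Assumption: the raw bytes were written in (frames, rows, columns) order.
    shape = tf.stack([parsed['depth'], parsed['height'], parsed['width']])
    image = tf.reshape(tf.io.decode_raw(parsed['image_raw'], tf.uint8), shape)
    # The mask is returned as raw bytes because its shape is not stored.
    return image, parsed['mask_raw']


def load_dataset(tfrecord_path):
    """Return a tf.data pipeline over every record in the file."""
    return tf.data.TFRecordDataset(tfrecord_path).map(parse_example)
```

If a label or filename is needed at read time, it would first have to be written as its own feature by the writing script and then added to FEATURES here.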

The images are greyscale and multi-page.