TensorFlow record with float NumPy array

I want to create TensorFlow records to feed my model. So far I use the following code to store a uint8 NumPy array in TFRecord format:

def _int64_feature(value):
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def _bytes_feature(value):
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def _floats_feature(value):
  return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))


def convert_to_record(name, image, label, map):
    filename = os.path.join(params.TRAINING_RECORDS_DATA_DIR, name + '.' + params.DATA_EXT)

    writer = tf.python_io.TFRecordWriter(filename)

    image_raw = image.tostring()
    map_raw   = map.tostring()
    label_raw = label.tostring()

    example = tf.train.Example(features=tf.train.Features(feature={
        'image_raw': _bytes_feature(image_raw),
        'map_raw': _bytes_feature(map_raw),
        'label_raw': _bytes_feature(label_raw)
    }))        
    writer.write(example.SerializeToString())
    writer.close()

which I read with this example code:

features = tf.parse_single_example(example, features={
  'image_raw': tf.FixedLenFeature([], tf.string),
  'map_raw': tf.FixedLenFeature([], tf.string),
  'label_raw': tf.FixedLenFeature([], tf.string),
})

image = tf.decode_raw(features['image_raw'], tf.uint8)
image.set_shape(params.IMAGE_HEIGHT*params.IMAGE_WIDTH*3)
image = tf.reshape(image, (params.IMAGE_HEIGHT,params.IMAGE_WIDTH,3))

map = tf.decode_raw(features['map_raw'], tf.uint8)
map.set_shape(params.MAP_HEIGHT*params.MAP_WIDTH*params.MAP_DEPTH)
map = tf.reshape(map, (params.MAP_HEIGHT,params.MAP_WIDTH,params.MAP_DEPTH))

label = tf.decode_raw(features['label_raw'], tf.uint8)
label.set_shape(params.NUM_CLASSES)

That works fine. Now I want to do the same with my array "map" being a float NumPy array instead of uint8, but I could not find examples of how to do it. I tried the function _floats_feature, which works if I pass it a scalar but not an array. With uint8, the serialization can be done with the tostring() method.

How can I serialize a float numpy array and how can I read that back?


FloatList and BytesList expect an iterable, so you need to pass a list of floats. Remove the extra brackets in your _floats_feature, i.e.:

def _floats_feature(value):
  return tf.train.Feature(float_list=tf.train.FloatList(value=value))

numpy_arr = np.ones((3,)).astype(np.float64)  # np.float was removed in NumPy 1.24
example = tf.train.Example(features=tf.train.Features(feature={"bytes": _floats_feature(numpy_arr)}))
print(example)

features {
  feature {
    key: "bytes"
    value {
      float_list {
        value: 1.0
        value: 1.0
        value: 1.0
      }
    }
  }
}


I will expand on Yaroslav's answer.

Int64List, BytesList and FloatList expect an iterable of the underlying elements (a repeated field). In your case you can use a list as the iterable.

You mentioned that it works if you pass a scalar, but not with arrays. That is expected: when you pass a scalar, your _floats_feature wraps it in a list with one float element, which is what FloatList wants. When you pass an array, the same brackets create a list of arrays, and that list is passed to a function which expects a list of floats.

So just remove the list construction from your function: float_list=tf.train.FloatList(value=value)
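To make the difference concrete, here is a minimal sketch, assuming TensorFlow 2.x and NumPy are installed. The variable names are made up for illustration. It builds the feature the corrected way and then tries the bracketed form, catching whatever error protobuf raises rather than assuming a specific message.

```python
import numpy as np
import tensorflow as tf

# Illustrative input: a 1-D float array, as in the answer above.
arr = np.ones((3,), dtype=np.float64)

# Corrected form: the array itself is the iterable of floats.
feature = tf.train.Feature(float_list=tf.train.FloatList(value=arr))
n_values = len(feature.float_list.value)

# Original form: the brackets produce a list whose only element is an
# array, so FloatList is asked to store an ndarray as a single float.
try:
    tf.train.FloatList(value=[arr])
    wrapped_form_rejected = False
except (TypeError, ValueError):
    wrapped_form_rejected = True

print(n_values, wrapped_form_rejected)
```

The first call stores three separate float values; the second is the situation described above, a list of arrays handed to a field that holds floats.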


Yaroslav's example failed when an n-dimensional array was the input:

numpy_arr = np.ones((3,3)).astype(np.float64)

I found that it worked when I used numpy_arr.ravel() as the input. But is there a better way to do it?
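For what it's worth, two approaches seem common, and both can be sketched with NumPy alone. The first keeps ravel() and stores the shape next to the flat values so the array can be rebuilt later. The second serializes the raw bytes, as the question already does for uint8 (tobytes() is the current name for tostring()); on the TensorFlow side that would be decoded with tf.decode_raw(..., tf.float32) instead of tf.uint8. The names below are illustrative, not taken from any library example.

```python
import numpy as np

# Illustrative 2-D float array standing in for the real data.
arr = np.arange(9, dtype=np.float32).reshape(3, 3)

# Option 1: flatten for a FloatList and keep the shape alongside it
# (for example in an Int64List feature) so it can be restored.
flat = arr.ravel()
shape = list(arr.shape)
restored_from_flat = np.asarray(flat).reshape(shape)

# Option 2: raw bytes for a BytesList. The dtype is not stored in the
# bytes, so the reader must know it is float32.
raw = arr.tobytes()
restored_from_bytes = np.frombuffer(raw, dtype=np.float32).reshape(shape)

print(restored_from_flat.shape, len(raw))
```

The bytes route is compact and keeps the exact bit pattern, but the reader has to know both dtype and shape in advance. The FloatList route is readable when the Example is printed, at the cost of storing everything as float32.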


I've stumbled across this while working on a similar problem. Since part of the original question was how to read back the float32 feature from tfrecords, I'll leave this here in case it helps anyone:

If map.ravel() was used to pass a map of dimensions [x, y, z] to _floats_feature, it can be read back with the shape given in the feature spec:

features = {
    ...
    'map': tf.FixedLenFeature([x, y, z], dtype=tf.float32)
    ...
}
parsed_example = tf.parse_single_example(serialized=serialized, features=features)
map = parsed_example['map']
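Putting the write and read sides together, a full round trip might look like the sketch below. It assumes TensorFlow 2.x, where the same functions live under tf.io (tf.io.TFRecordWriter, tf.io.FixedLenFeature, tf.io.parse_single_example). The dimensions, file path and variable names are made up for the example.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical map dimensions and data, for illustration only.
x, y, z = 2, 3, 4
map_arr = np.arange(x * y * z, dtype=np.float32).reshape(x, y, z)


def _floats_feature(value):
    # value must be a flat iterable of floats.
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))


# Write one Example containing the flattened array.
path = os.path.join(tempfile.mkdtemp(), "example.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    example = tf.train.Example(features=tf.train.Features(
        feature={"map": _floats_feature(map_arr.ravel())}))
    writer.write(example.SerializeToString())

# Read it back; FixedLenFeature reshapes the flat list to [x, y, z].
feature_spec = {"map": tf.io.FixedLenFeature([x, y, z], dtype=tf.float32)}


def parse(serialized):
    return tf.io.parse_single_example(serialized, feature_spec)["map"]


dataset = tf.data.TFRecordDataset(path).map(parse)
restored = next(iter(dataset)).numpy()
print(restored.shape)
```

Because the shape is fixed in the feature spec, no separate reshape is needed after parsing; the trade-off is that every record must contain exactly x*y*z values.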
