Keras2 ImageDataGenerator or TensorFlow tf.data?

With Keras2 being implemented into TensorFlow and TensorFlow 2.0 on the horizon, should you use Keras ImageDataGenerator with, e.g., flow_from_directory, or tf.data from TensorFlow, which can now also be used with Keras's fit_generator?

Will both methods have their place by serving different purposes, or will tf.data be the new way to go and Keras generators be deprecated in the future?

Thanks. I would like to take the path that keeps me up to date a bit longer in this fast-moving field.

Since its release, the TensorFlow Dataset API has been the default recommended way to construct input pipelines for any model built on a TensorFlow backend, both Keras and low-level TensorFlow. In later versions of TF 1.xx it can be passed directly to the tf.keras.Model.fit method as

model.fit(dataset, epochs=epochs)

It's good both for rapid prototyping,

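# train_images and train_labels are in-memory arrays with the same first dimension
dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
dataset = dataset.shuffle(buffer_size).repeat().batch(batch_size)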
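
Filled in with concrete arguments, the prototyping pipeline might look like the sketch below. The helper names build_dataset and steps_per_epoch, the default batch size, and the choice of a shuffle buffer as large as the dataset are illustrative assumptions, not part of the original answer. Because repeat() without an argument makes the dataset infinite, fit needs steps_per_epoch to know where an epoch ends.

```python
import math


def steps_per_epoch(num_examples, batch_size):
    """Number of batches that covers every example at least once."""
    return math.ceil(num_examples / batch_size)


def build_dataset(features, labels, batch_size=32):
    """Build a shuffled, repeating, batched tf.data pipeline from in-memory arrays."""
    # Imported here so steps_per_epoch stays usable without TensorFlow installed.
    import tensorflow as tf

    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    # A buffer as large as the dataset gives a full shuffle; a smaller buffer
    # trades shuffle quality for memory.
    dataset = dataset.shuffle(buffer_size=len(features))
    return dataset.repeat().batch(batch_size)
```

With a compiled Keras model, this would be used roughly as model.fit(build_dataset(x, y, 32), epochs=5, steps_per_epoch=steps_per_epoch(len(x), 32)).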

And for building complex, high-performance ETL pipelines (see "4. Upgrade your data input pipelines"). More on this here: https://www.tensorflow.org/guide/performance/datasets

As per official docs, in TF 2.0 it'll also be the default way to input data to the model. https://www.tensorflow.org/alpha/guide/migration_guide

Because the upcoming TensorFlow version will execute eagerly by default, the dataset object will become iterable and will be even easier to use.

I'm continuing to take notes about my mistakes/difficulties using TensorFlow. I had a Keras ImageDataGenerator that I wanted to wrap as a tf.data.Dataset. I couldn't adapt the documentation to my own use case. Here is a concrete example for image classification.

Alongside custom defined Python generators, you can wrap the ImageDataGenerator from Keras inside tf.data.

The following snippets are taken from the TensorFlow 2.0 documentation.

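# flowers is the path to a directory with one sub-folder per class (5 classes here)
img_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, rotation_range=20)
ds = tf.data.Dataset.from_generator(
    img_gen.flow_from_directory, args=[flowers],
    output_types=(tf.float32, tf.float32),
    # flow_from_directory defaults: batch_size=32, target_size=(256, 256)
    output_shapes=([32, 256, 256, 3], [32, 5])
)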

Therefore, one can still use the typical Keras ImageDataGenerator; you just need to wrap it into a tf.data.Dataset as above.

Keras's ImageDataGenerator is easy to use, but it reads images more slowly than tf.data, which TensorFlow uses as its built-in pipeline for large data.

For me, I prefer to build a generator with yield:

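from glob import glob

import cv2 as cv
import numpy as np

def generator(path, batch_size=4):
    # path must end with a separator, e.g. 'images/'
    imgs = glob(path + '*.jpg')
    while True:
        batch = []
        for i in range(batch_size):
            # pick a random image (with replacement)
            idx = np.random.randint(0, len(imgs))
            img = cv.resize(cv.imread(imgs[idx]), (256, 256)) / 255
            batch.append(img)
        batch = np.array(batch)
        yield batch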

Then create the generator and pass it to model.fit_generator, and it will work.

You can choose data randomly like this or use some recurrent methods.

Though the code is rough, it is easy to change so that it can generate complex batches.

Note that this approach is for TF 1.X with Keras2, not for TensorFlow 2.0.
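
The generator above samples with replacement, so within one pass some images may show up twice and others not at all. One reading of the "recurrent" alternative is to shuffle the file list once per epoch and then walk through it in order. The sketch below does only that and yields batches of file paths; loading and resizing would stay as in the code above. The name epoch_batches and the seed parameter are assumptions made for illustration.

```python
import random


def epoch_batches(paths, batch_size=4, seed=None):
    """Yield batches of paths forever, visiting every path once per epoch."""
    order = list(paths)
    if not order:
        raise ValueError("paths must not be empty")
    rng = random.Random(seed)
    while True:
        rng.shuffle(order)  # new random order for each epoch
        for start in range(0, len(order), batch_size):
            # The last batch of an epoch is shorter when len(paths) is not
            # a multiple of batch_size.
            yield order[start:start + batch_size]
```

Slicing instead of random indexing is the whole design choice here: every file is seen exactly once before any file repeats.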

Both batch_x and batch_y in ImageDataGenerator are of type K.floatx(), so they are tf.float32 by default. A similar question was already discussed at "How to use Keras generator with tf.data API".

In Keras, image augmentation is handled by the ImageDataGenerator class, which offers a lot of convenience features (for example, rotation_range=30). To feed several inputs, one possibility is to join three ImageDataGenerator instances into one, using class_mode=None (so they don't return any target) and shuffle=False (important). Make sure you use the same batch_size for each, that each input is in a different directory, that the targets are also in a different directory, and that there are exactly the same number of images in each directory.
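
The joining step can be sketched in plain Python, independent of Keras: advance the per-input iterators and the target iterator in lockstep, so that each step yields one batch from each. The function name join_generators and the ((x1, x2, ...), y) tuple layout are assumptions; the iterators stand in for flow_from_directory calls created with class_mode=None, shuffle=False and the same batch_size.

```python
def join_generators(input_gens, target_gen):
    """Yield ((x1, x2, ...), y) by advancing all generators in lockstep."""
    # zip stops as soon as any generator is exhausted; Keras directory
    # iterators loop forever, so the joined generator does too.
    for *inputs, target in zip(*input_gens, target_gen):
        yield tuple(inputs), target
```

Lockstep iteration only pairs the right files if every directory is read in the same order, which is why shuffle=False and equal image counts matter.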

Comments
  • Thank you sharky, also for the edit of my post. So it can be said that keras Generators will be replaced by tf.data, correct? Cheers
  • Although we can't say this for sure, I think that it will be eventually replaced. Keras preprocessing is a little outdated. It was built as a simpler alternative to TF's queue runners. But now tf.data is just as simple to use, and has more potential functionality. So if you know how to use tf.data, there's no need to use other methods
  • Cheers. One additional question: I know from Keras that you can easily import data from folder structures. Everything is kind of automated. I looked into tf.data and it seems to need much more manual effort. Did I just not look correctly, or are the tf functions less automated? Thanks again for your explanations
  • It's easy. from_tensor_slices can accept a list of filenames as an argument. Then just apply dataset.map with any preprocessing you need. See my other answer stackoverflow.com/questions/55332476/…
  • Sharky you are the best. Thanks for your help
  • This is really the best of both worlds!
  • Yes, I am also of the same opinion. At least the switch to TF 2.0 is done step by step.
  • Dear Calin, that is great news. I think they noticed how valuable ImageDataGenerator was/is. Thanks
  • For TF 2.X, just wrap your generator with Dataset.from_generator (tensorflow.org/api_docs/python/tf/data/Dataset#from_generator)
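
For the folder-structure question in the comments, the from_tensor_slices plus map route could look like the sketch below. It assumes a root/<class_name>/<image>.jpg layout like the one flow_from_directory expects; the helper names, the 256x256 size and the integer (not one-hot) labels are assumptions made for illustration.

```python
from pathlib import Path


def list_labeled_files(root, pattern="*.jpg"):
    """Return (paths, labels, class_names) for a root/<class_name>/<image> layout."""
    class_names = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    paths, labels = [], []
    for index, name in enumerate(class_names):
        for file in sorted((Path(root) / name).glob(pattern)):
            paths.append(str(file))
            labels.append(index)  # label = position of the folder name
    return paths, labels, class_names


def build_image_dataset(paths, labels, image_size=(256, 256), batch_size=32):
    """Turn file paths and integer labels into a batched tf.data.Dataset."""
    # Imported here so the listing helper works without TensorFlow installed.
    import tensorflow as tf

    def load(path, label):
        image = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
        image = tf.image.resize(image, image_size) / 255.0  # float32 in [0, 1]
        return image, label

    dataset = tf.data.Dataset.from_tensor_slices((paths, labels))
    # Shuffle the cheap (path, label) pairs before decoding any image.
    return dataset.shuffle(len(paths)).map(load).batch(batch_size)
```

Only the listing step is manual; everything after from_tensor_slices is ordinary dataset.map preprocessing.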