What is the difference between Dataset.from_tensors and Dataset.from_tensor_slices?

I have a dataset represented as a NumPy matrix of shape `(num_features, num_examples)` and I wish to convert it to a TensorFlow `tf.data.Dataset`.

I am struggling to understand the difference between these two methods: `Dataset.from_tensors` and `Dataset.from_tensor_slices`. Which one is correct, and why?

The TensorFlow documentation (link) says that both methods accept a nested structure of tensors, although when using `from_tensor_slices` all tensors must have the same size in the 0-th dimension.
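For the `(num_features, num_examples)` matrix in the question, you typically want one dataset element per example, which means slicing along the examples axis. A minimal sketch (assuming TensorFlow 2.x, where datasets are iterable eagerly): transpose first so examples lie along axis 0, then slice.

```python
import numpy as np
import tensorflow as tf

num_features, num_examples = 3, 5
data = np.random.rand(num_features, num_examples)  # shape (3, 5)

# from_tensor_slices splits along axis 0, so transpose to (num_examples, num_features)
ds = tf.data.Dataset.from_tensor_slices(data.T)

for example in ds.take(1):
    print(example.shape)  # (3,) -- one feature vector per element
```

This yields `num_examples` elements, each a feature vector of shape `(num_features,)`.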

`from_tensors` combines the input and returns a dataset with a single element:

```
t = tf.constant([[1, 2], [3, 4]])
ds = tf.data.Dataset.from_tensors(t)   # [[1, 2], [3, 4]]
```

`from_tensor_slices` creates a dataset with a separate element for each row of the input tensor:

```
t = tf.constant([[1, 2], [3, 4]])
ds = tf.data.Dataset.from_tensor_slices(t)   # [1, 2], [3, 4]
```


1) The main difference between the two is that nested elements passed to `from_tensor_slices` must have the same dimension in the 0th rank:

```
# raises ValueError: Dimensions 10 and 9 are not compatible
dataset1 = tf.data.Dataset.from_tensor_slices(
    (tf.random_uniform([10, 4]), tf.random_uniform([9])))

# OK: the first dimensions match (10 and 10)
dataset2 = tf.data.Dataset.from_tensor_slices(
    (tf.random_uniform([10, 4]), tf.random_uniform([10])))
```
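Under TensorFlow 2.x (`tf.random.uniform` replaces the 1.x `tf.random_uniform`), the same mismatch can be demonstrated and caught. By contrast, `from_tensors` never slices, so it accepts the mismatched pair. A sketch:

```python
import tensorflow as tf

# from_tensor_slices must slice along axis 0, so the first dimensions must agree
try:
    tf.data.Dataset.from_tensor_slices(
        (tf.random.uniform([10, 4]), tf.random.uniform([9])))
except (ValueError, tf.errors.InvalidArgumentError) as e:
    print("from_tensor_slices rejects it:", e)

# from_tensors wraps the whole structure as one element, so no slicing occurs
ds = tf.data.Dataset.from_tensors(
    (tf.random.uniform([10, 4]), tf.random.uniform([9])))
```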

2) The second difference, explained here, arises when the input to a `tf.data.Dataset` is a list. For example:

```
dataset1 = tf.data.Dataset.from_tensor_slices(
    [tf.random_uniform([2, 3]), tf.random_uniform([2, 3])])

dataset2 = tf.data.Dataset.from_tensors(
    [tf.random_uniform([2, 3]), tf.random_uniform([2, 3])])

print(dataset1)  # shapes: (2, 3)
print(dataset2)  # shapes: (2, 2, 3)
```

In the above, `from_tensors` stacks the list into a single 3D tensor (one dataset element), while `from_tensor_slices` keeps each 2D tensor as a separate element. This can be handy if you have different sources for different image channels and want to combine them into a single RGB image tensor.
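As a sketch of that channel use case (hypothetical `r`, `g`, `b` channel tensors; TensorFlow 2.x): stacking three `(h, w)` channels with `from_tensors` gives one `(3, h, w)` element, which a `map` can rearrange to a channels-last image.

```python
import tensorflow as tf

h, w = 4, 4  # hypothetical image size
r = tf.random.uniform([h, w])
g = tf.random.uniform([h, w])
b = tf.random.uniform([h, w])

# from_tensors stacks the list into a single (3, h, w) element ...
ds = tf.data.Dataset.from_tensors([r, g, b])
# ... which can be transposed to a channels-last (h, w, 3) image
ds = ds.map(lambda chw: tf.transpose(chw, [1, 2, 0]))
```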

3) As mentioned in the previous answer, `from_tensors` converts the input into one big tensor:

```
import tensorflow as tf

tf.enable_eager_execution()

dataset1 = tf.data.Dataset.from_tensor_slices(
    (tf.random_uniform([4, 2]), tf.random_uniform([4])))

dataset2 = tf.data.Dataset.from_tensors(
    (tf.random_uniform([4, 2]), tf.random_uniform([4])))

for i, item in enumerate(dataset1):
    print('element: ' + str(i + 1), item[0], item[1])

print(30 * '-')

for i, item in enumerate(dataset2):
    print('element: ' + str(i + 1), item[0], item[1])
```

output:

```element: 1 tf.Tensor(... shapes: ((2,), ()))
element: 2 tf.Tensor(... shapes: ((2,), ()))
element: 3 tf.Tensor(... shapes: ((2,), ()))
element: 4 tf.Tensor(... shapes: ((2,), ()))
-------------------------
element: 1 tf.Tensor(... shapes: ((4, 2), (4,)))
```
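A useful way to summarize this relationship (TensorFlow 2.x, where `Dataset.unbatch` and `Dataset.batch` are available): `from_tensor_slices(t)` behaves like `from_tensors(t).unbatch()`, and batching the slices back together recovers the single big element. A sketch:

```python
import tensorflow as tf

t = tf.random.uniform([4, 2])

sliced = tf.data.Dataset.from_tensor_slices(t)   # 4 elements of shape (2,)
whole = tf.data.Dataset.from_tensors(t)          # 1 element of shape (4, 2)

# Unbatching the single big element gives the same per-row elements
assert [e.shape for e in whole.unbatch()] == [e.shape for e in sliced]

# Batching all slices back yields one (4, 2) element again
rebatched = next(iter(sliced.batch(4)))
assert rebatched.shape == (4, 2)
```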


Try this:

```
import tensorflow as tf  # 1.13.1
tf.enable_eager_execution()

t1 = tf.constant([[11, 22], [33, 44], [55, 66]])

print("\n=========     from_tensors     ===========")
ds = tf.data.Dataset.from_tensors(t1)
print(ds.output_types, end=' : ')
print(ds.output_shapes)
for e in ds:
    print(e)

print("\n=========   from_tensor_slices    ===========")
ds = tf.data.Dataset.from_tensor_slices(t1)
print(ds.output_types, end=' : ')
print(ds.output_shapes)
for e in ds:
    print(e)
```

output :

```
=========     from_tensors     ===========
<dtype: 'int32'> : (3, 2)
tf.Tensor(
[[11 22]
 [33 44]
 [55 66]], shape=(3, 2), dtype=int32)

=========   from_tensor_slices    ===========
<dtype: 'int32'> : (2,)
tf.Tensor([11 22], shape=(2,), dtype=int32)
tf.Tensor([33 44], shape=(2,), dtype=int32)
tf.Tensor([55 66], shape=(2,), dtype=int32)
```

The output is pretty much self-explanatory: as you can see, `from_tensor_slices()` slices (what would be) the output of `from_tensors()` along its first dimension. You can also try it with:

```
t1 = tf.constant([[[11, 22], [33, 44], [55, 66]],
                  [[110, 220], [330, 440], [550, 660]]])
```
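With that rank-3 tensor (shape `(2, 3, 2)`), and assuming TensorFlow 2.x eager iteration, `from_tensor_slices` again just peels off the first dimension, yielding two elements of shape `(3, 2)`:

```python
import tensorflow as tf

t1 = tf.constant([[[11, 22], [33, 44], [55, 66]],
                  [[110, 220], [330, 440], [550, 660]]])  # shape (2, 3, 2)

for e in tf.data.Dataset.from_tensor_slices(t1):
    print(e.shape)  # (3, 2), printed twice
```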
