tf.data.Dataset: how to get the dataset size (number of elements in an epoch)?
len(list(dataset)) works in eager mode, although that’s obviously not a good general solution.
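If you want the size without iterating the whole pipeline, a minimal sketch (assuming TF 2.x eager execution; the toy range dataset is just for illustration):

import tensorflow as tf

dataset = tf.data.Dataset.range(42)

# Eager-mode approach: materializes every element, so it is O(n).
print(len(list(dataset)))  # 42

# Reports the size without iterating when it is statically known;
# returns tf.data.experimental.UNKNOWN_CARDINALITY (e.g. after filter())
# or INFINITE_CARDINALITY (e.g. after repeat()) otherwise.
print(tf.data.experimental.cardinality(dataset).numpy())  # 42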
Assuming you have an all_dataset variable of type tf.data.Dataset:

test_dataset = all_dataset.take(1000)
train_dataset = all_dataset.skip(1000)

The test dataset now has the first 1000 elements, and the rest goes to training.
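A runnable sketch of that split on a toy range dataset (the 1000-element cutoff comes from the answer above; everything else here is illustrative). One caveat worth noting: if you shuffle before splitting, pass reshuffle_each_iteration=False to shuffle(), or the two splits will leak into each other across epochs.

import tensorflow as tf

all_dataset = tf.data.Dataset.range(5000)

test_dataset = all_dataset.take(1000)    # first 1000 elements
train_dataset = all_dataset.skip(1000)   # remaining 4000 elements

print(tf.data.experimental.cardinality(test_dataset).numpy())   # 1000
print(tf.data.experimental.cardinality(train_dataset).numpy())  # 4000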
from_tensors combines the input and returns a dataset with a single element:

>>> t = tf.constant([[1, 2], [3, 4]])
>>> ds = tf.data.Dataset.from_tensors(t)
>>> [x for x in ds]
[<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4]], dtype=int32)>]

from_tensor_slices creates a dataset with a separate element for each row of the input tensor:

>>> ds = tf.data.Dataset.from_tensor_slices(t)
>>> [x for x in ds]
[<tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 2], dtype=int32)>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([3, 4], dtype=int32)>]
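A related sketch (not part of the answer above): from_tensor_slices also slices tuples and dicts along their first dimension, which is the usual way to pair feature rows with labels.

import tensorflow as tf

features = tf.constant([[1, 2], [3, 4]])
labels = tf.constant([0, 1])

# Each dataset element is one (feature_row, label) pair.
ds = tf.data.Dataset.from_tensor_slices((features, labels))
for x, y in ds:
    print(x.numpy(), y.numpy())
# [1 2] 0
# [3 4] 1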
TL;DR Despite their similar names, these arguments have quite different meanings. The buffer_size in Dataset.shuffle() can affect the randomness of your dataset, and hence the order in which elements are produced. The buffer_size in Dataset.prefetch() only affects the time it takes to produce the next element. The buffer_size argument in tf.data.Dataset.prefetch() and the output_buffer_size argument …
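A small sketch of the difference (the shuffled outputs vary run to run, since shuffling is random; tf.data.AUTOTUNE assumes TF 2.4+, earlier versions spell it tf.data.experimental.AUTOTUNE):

import tensorflow as tf

ds = tf.data.Dataset.range(10)

# A small shuffle buffer only mixes elements within a sliding window
# of 3, so the output is only locally shuffled.
print(list(ds.shuffle(buffer_size=3).as_numpy_iterator()))

# A buffer at least as large as the dataset gives a uniform shuffle.
print(list(ds.shuffle(buffer_size=10).as_numpy_iterator()))

# prefetch overlaps producing the next element(s) with consuming the
# current one; it affects throughput only, never element order.
ds = ds.prefetch(buffer_size=tf.data.AUTOTUNE)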