When sharding data in TensorFlow, follow these guidelines for data shuffling:
Use a shuffle_buffer of size greater than the size of the dataset.
In a multi-worker scenario with sharding, each worker has access to a 1/num_shards subset of the dataset. In theory, the shuffle_buffer used in this case should be greater than (1/num_shards) * dataset_size.
While large buffer sizes help shuffle the data more thoroughly, they can take a lot of memory and significant time to fill. A practical value for the buffer size is 10 * batch_size.
Introduce randomness into the data loading pipeline at several points: shuffle the data when writing it into multiple files, split the dataset across multiple workers by sharding, use interleave and map with parallel calls, and shuffle with a reasonably sized shuffle buffer.
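The guidelines above can be sketched as a tf.data input pipeline. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the input is stored as TFRecord files, and the names shuffle_buffer_size, build_dataset, parse_fn, memory_limited, file_shuffle_buffer, and the parallelism value of 4 are hypothetical choices made for illustration.

```python
import math


def shuffle_buffer_size(dataset_size, num_shards, batch_size, memory_limited=True):
    """Pick a shuffle buffer size (in examples) following the guidelines.

    memory_limited is a hypothetical flag: when False, the buffer covers
    the worker's whole subset; when True, it is capped at 10 * batch_size.
    """
    # Each worker sees roughly (1/num_shards) * dataset_size examples.
    per_worker = math.ceil(dataset_size / num_shards)
    # "Greater than" the worker's subset: one more than its size.
    full = per_worker + 1
    if not memory_limited:
        return full
    # Practical fallback when a full-size buffer is too costly.
    return min(full, 10 * batch_size)


def build_dataset(file_pattern, num_shards, shard_index, batch_size,
                  buffer_size, parse_fn, file_shuffle_buffer=1024):
    """Sketch of a sharded, shuffled pipeline (assumes TFRecord files)."""
    # Imported here so the sizing helper above works without TensorFlow.
    import tensorflow as tf

    # List files in a deterministic order so every worker shards the
    # same list, then shard BEFORE any randomizing operation.
    files = tf.data.Dataset.list_files(file_pattern, shuffle=False)
    files = files.shard(num_shards, shard_index)
    # Shuffle the order of this worker's files.
    files = files.shuffle(file_shuffle_buffer)

    # Read several files at once; interleaving mixes their records.
    dataset = files.interleave(
        tf.data.TFRecordDataset,
        cycle_length=4,
        num_parallel_calls=4,
    )
    # Shuffle individual records, then parse them in parallel.
    dataset = dataset.shuffle(buffer_size)
    dataset = dataset.map(parse_fn, num_parallel_calls=4)
    return dataset.batch(batch_size, drop_remainder=True)
```

For example, with 1,000,000 examples, 4 shards, and a batch size of 32, shuffle_buffer_size returns 320 when memory is limited and 250001 when it is not. Sharding happens before shuffling so that workers do not end up reading overlapping files.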
Changes in TensorFlow 1.14+
TensorFlow 1.14 includes significant changes in preparation for transition to TensorFlow 2.0. These changes include:
Keras layers are the recommended way to build your model.
Mixed precision is now a first-class feature through the Keras mixed precision policy.
Significant portions of tf.contrib have been removed in favor of the officially integrated alternatives.
Update to TensorFlow 1.14+
To update your code for use with TensorFlow 1.14+ and avoid using deprecated features, you should:
Use the Keras mixed precision policy to run your model in mixed precision.
Make changes to your model and input functions to remove deprecation warnings seen during execution. The warning will indicate the replacement you should make. This usually involves updating a function to use an alternative from tf.compat.v1. See an example warning below:
WARNING:tensorflow:From onlinenorm_test.py:67: The name tf.logging.info is deprecated. Please use
While removing the deprecation warnings, run the model first with --mode validate_only. This skips the later compilation stages and speeds up iteration.
Remove any use of tf.contrib. This module has been entirely deprecated and does not exist in TensorFlow 2.x. Searching for the exact function you are currently using should lead you to a suitable replacement.
Finally, make sure to follow the documentation for TensorFlow 1.15.
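When a run produces many deprecation warnings, it can help to collect the suggested replacements in one place before editing the code. The sketch below is one possible way to do that, not part of TensorFlow: collect_replacements and sample_log are hypothetical names, and the pattern assumes each warning has the shape "From <file>:<line>: The name <old> is deprecated. Please use <new> instead.", which may not cover every kind of deprecation message.

```python
import re

# Assumed warning shape (see the lead-in); whitespace is matched
# loosely so that wrapped log lines are still recognized.
WARNING_RE = re.compile(
    r"From\s+(?P<path>[^:\s]+):(?P<line>\d+):\s+"
    r"The name\s+(?P<old>\S+)\s+is deprecated\.\s+"
    r"Please use\s+(?P<new>\S+)\s+instead\."
)


def collect_replacements(log_text):
    """Return a list of (path, line, old_name, new_name) tuples."""
    found = []
    for match in WARNING_RE.finditer(log_text):
        found.append((
            match.group("path"),
            int(match.group("line")),
            match.group("old"),
            match.group("new"),
        ))
    return found


# Illustrative log text, written to match the assumed warning shape.
sample_log = (
    "WARNING:tensorflow:From onlinenorm_test.py:67: The name "
    "tf.logging.info is deprecated. Please use "
    "tf.compat.v1.logging.info instead.\n"
)

for path, line, old, new in collect_replacements(sample_log):
    print("{}:{}: replace {} with {}".format(path, line, old, new))
```

Running the model with --mode validate_only and feeding its captured output to a helper like this gives a checklist of call sites to update on each iteration.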