.. _early-stopping:

Early Stopping
==============

Using a custom hook called ``CerebrasEarlyStoppingHook`` you can terminate early a neural network training based on some logic. This hook is similar to the `Keras EarlyStopping class <https://keras.io/api/callbacks/early_stopping/>`__.  The ``CerebrasEarlyStoppingHook`` can be used in Tensorflow either on the CS system or on a CPU.

.. important::

	Early stopping with ``CerebrasEarlyStoppingHook`` is currently supported only on the data accessible by the ``model_fn``. This means that if you are running training, ``CerebrasEarlyStoppingHook`` will only compute the stopping condition based on the data provided for the training run. If you are running evaluation, ``CerebrasEarlyStoppingHook`` will only compute the stopping condition based on the validation data.

Example
-------

See the following Tensorflow example.

.. code-block:: python

    def acc_early_stop(logits, labels):
        train_acc = tf.compat.v1.metrics.accuracy(
            tf.argmax(labels, 1), tf.argmax(logits, 1)
        )
        # Return True if training accuracy is greater than 90%.
    return tf.math.greater(train_acc[0], tf.constant(0.9))

    def loss_early_stop(loss, threshold):
        # Return True if training loss is lower than threshold.
        return tf.math.less(train_acc, tf.constant(threshold))

    def model_fn(features, labels, mode, params):
        ...
        # Specify the model.
        ...
        training_hooks = [
            # Check acc_early_stop every 1000th iteration and stop training if True.
            CerebrasEarlyStoppingHook(acc_early_stop, [logits, labels], every_n_iter=1000),
            # Check loss_early_stop every 500th iteration and stop training if True.
            CerebrasEarlyStoppingHook(loss_early_stop, [loss, 0.01], every_n_iter=500)
        ]
        ...
        spec = CSEstimatorSpec(
            ...
            training_hooks=training_hooks
            ...
        )
        return spec

In the above example, the function ``acc_early_stop`` returns ``True`` if the training accuracy is greater than 90%, and the function ``loss_early_stop`` returns ``True`` if the training loss is lower than the ``threshold`` argument.

The first ``CerebrasEarlyStoppingHook`` in the ``training_hooks`` list evaluates the ``acc_early_stop`` function once every 1000 iterations. If ``acc_early_stop`` function evaluates to ``True``, then training is stopped. If the training accuracy is not greater than 90% then ``acc_early_stop`` function is evaluated at the next 1000th iteration.

Similarly the second ``CerebrasEarlyStoppingHook`` evaluates ``loss_early_stop`` function every 500th iteration and stops the training if ``True``.

.. note::

	A function like ``acc_early_stop`` or ``loss_early_stop`` must return a 0 rank Boolean tensor. There are no other restrictions on the computation that occurs inside such a function. This function runs on the host.