common.tf.hooks package#

Submodules#

common.tf.hooks.grad_accum_hooks module#

class common.tf.hooks.grad_accum_hooks.GradAccumLoggingTensorHook#

Bases: tensorflow.python.training.basic_session_run_hooks.LoggingTensorHook

Prints the given tensors every N steps, every N seconds, or at end.

The tensors will be printed to the log, with INFO severity. If you are not seeing the logs, you might want to add the following line after your imports:

```python

tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.INFO)

```

Note that if at_end is True, tensors should not include any tensor whose evaluation produces a side effect such as consuming additional inputs.

Initializes a GradAccumLoggingTensorHook.

Parameters
  • trainer (Trainer) – common.optimizers.Trainer object used for model training with gradient accumulation.

  • tensors (dict) – dict that maps string-valued tags to tensors/tensor names.

  • every_n_steps (int) – Print the values of tensors once every N steps taken on the current worker.

  • every_n_secs (int) – Print the values of tensors once every N seconds. Exactly one of every_n_steps and every_n_secs should be provided.

  • at_end (bool) – Specify whether to print the values of tensors at the end of the run.

  • formatter (function) – A function that takes a dict mapping tags to `Tensor` values and returns a string. If None, the default formatter prints all tensors.

__init__(trainer, tensors, every_n_steps=None, every_n_secs=None, at_end=False, formatter=None)#

Initializes a GradAccumLoggingTensorHook.

Parameters
  • trainer (Trainer) – common.optimizers.Trainer object used for model training with gradient accumulation.

  • tensors (dict) – dict that maps string-valued tags to tensors/tensor names.

  • every_n_steps (int) – Print the values of tensors once every N steps taken on the current worker.

  • every_n_secs (int) – Print the values of tensors once every N seconds. Exactly one of every_n_steps and every_n_secs should be provided.

  • at_end (bool) – Specify whether to print the values of tensors at the end of the run.

  • formatter (function) – A function that takes a dict mapping tags to `Tensor` values and returns a string. If None, the default formatter prints all tensors.

after_run(run_context, run_values)#
before_run(run_context)#
begin()#
end(session)#
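A minimal sketch of wiring this hook up, assuming a `common.optimizers.Trainer` instance and a loss tensor are available. The custom `loss_formatter` below is hypothetical (the hook's default formatter prints all tensors); `make_logging_hook` shows one plausible construction:

```python
def loss_formatter(tag_to_value):
    # Render sorted "tag = value" pairs, e.g. "loss = 0.5000".
    return ", ".join(
        "{} = {:.4f}".format(tag, value)
        for tag, value in sorted(tag_to_value.items())
    )

def make_logging_hook(trainer, total_loss):
    # Hypothetical wiring: log the accumulated loss every 100 steps.
    # Exactly one of every_n_steps / every_n_secs may be provided.
    from common.tf.hooks.grad_accum_hooks import GradAccumLoggingTensorHook
    return GradAccumLoggingTensorHook(
        trainer=trainer,
        tensors={"loss": total_loss},
        every_n_steps=100,
        formatter=loss_formatter,
    )
```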
class common.tf.hooks.grad_accum_hooks.GradAccumStepCounterHook#

Bases: tensorflow.python.training.session_run_hook.SessionRunHook

Hook that counts and plots steps per second.

Initializes a GradAccumStepCounterHook.

Parameters
  • trainer (Trainer) – common.optimizers.Trainer object used for model training with gradient accumulation.

  • every_n_steps (int) – Log every N steps. Exactly one of every_n_steps and every_n_secs should be set.

  • every_n_secs (int) – Log every N seconds.

  • output_dir (string) – The directory to save the summaries to. Only used if no summary_writer is supplied.

  • summary_writer (SummaryWriter) – If None and an output_dir was passed, one will be created accordingly.

__init__(trainer, every_n_steps=100, every_n_secs=None, output_dir=None, summary_writer=None)#

Initializes a GradAccumStepCounterHook.

Parameters
  • trainer (Trainer) – common.optimizers.Trainer object used for model training with gradient accumulation.

  • every_n_steps (int) – Log every N steps. Exactly one of every_n_steps and every_n_secs should be set.

  • every_n_secs (int) – Log every N seconds.

  • output_dir (string) – The directory to save the summaries to. Only used if no summary_writer is supplied.

  • summary_writer (SummaryWriter) – If None and an output_dir was passed, one will be created accordingly.

after_run(run_context, run_values)#
begin()#
end(session=None)#
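A sketch of constructing this hook, assuming a `Trainer` instance is available. The `validate_frequency` helper is hypothetical and only mirrors the documented contract that exactly one of the two frequency arguments may be set:

```python
def validate_frequency(every_n_steps=None, every_n_secs=None):
    # Mirrors the hook's contract: exactly one of the two must be set.
    if (every_n_steps is None) == (every_n_secs is None):
        raise ValueError(
            "exactly one of every_n_steps and every_n_secs must be set"
        )

def make_step_counter_hook(trainer, output_dir):
    # Hypothetical wiring: plot steps/sec every 100 steps into output_dir.
    # Since no summary_writer is supplied, one is created for output_dir.
    from common.tf.hooks.grad_accum_hooks import GradAccumStepCounterHook
    validate_frequency(every_n_steps=100)
    return GradAccumStepCounterHook(
        trainer=trainer, every_n_steps=100, output_dir=output_dir
    )
```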
class common.tf.hooks.grad_accum_hooks.GradAccumSummarySaverHook#

Bases: tensorflow.python.training.session_run_hook.SessionRunHook

Saves summaries every N steps, where N is the number of effective batches seen by an optimizer in the gradient accumulation mode.

Initializes a GradAccumSummarySaverHook.

Parameters
  • trainer (Trainer) – common.optimizers.Trainer object used for model training with gradient accumulation.

  • tensors (dict) – dict that maps string-valued tags to tensors/tensor names.

  • save_steps (int) – Save summaries every N steps. Exactly one of save_secs and save_steps should be set.

  • save_secs (int) – Save summaries every N seconds.

  • output_dir (string) – The directory to save the summaries to. Only used if no summary_writer is supplied.

  • summary_writer (SummaryWriter) – If None and an output_dir was passed, one will be created accordingly.

__init__(trainer, tensors, save_steps=None, save_secs=None, output_dir=None, summary_writer=None)#

Initializes a GradAccumSummarySaverHook.

Parameters
  • trainer (Trainer) – common.optimizers.Trainer object used for model training with gradient accumulation.

  • tensors (dict) – dict that maps string-valued tags to tensors/tensor names.

  • save_steps (int) – Save summaries every N steps. Exactly one of save_secs and save_steps should be set.

  • save_secs (int) – Save summaries every N seconds.

  • output_dir (string) – The directory to save the summaries to. Only used if no summary_writer is supplied.

  • summary_writer (SummaryWriter) – If None and an output_dir was passed, one will be created accordingly.

after_run(run_context, run_values)#
before_run(run_context)#
begin()#
end(session=None)#
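Here "N steps" counts effective (optimizer) steps rather than micro-batch steps: with gradient accumulation, one optimizer update happens per `accum_steps` micro-batches. The `effective_step` helper below is illustrative, not part of the hook's API, and `make_summary_hook` is a hypothetical construction assuming a `Trainer` and a loss tensor:

```python
def effective_step(micro_step, accum_steps):
    # With gradient accumulation, the optimizer applies one update per
    # `accum_steps` micro-batches, so "every N steps" here is counted in
    # effective (optimizer) steps, not micro-batch steps.
    return micro_step // accum_steps

def make_summary_hook(trainer, total_loss, output_dir):
    # Hypothetical wiring: save a loss summary every 500 effective steps.
    # Exactly one of save_steps / save_secs may be set.
    from common.tf.hooks.grad_accum_hooks import GradAccumSummarySaverHook
    return GradAccumSummarySaverHook(
        trainer=trainer,
        tensors={"train/total_loss": total_loss},
        save_steps=500,
        output_dir=output_dir,
    )
```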
common.tf.hooks.grad_accum_hooks.get_grad_accum_hooks(trainer, runconfig_params, summary_dict=None, logging_dict=None)#

Initializes the hooks needed for model training with gradient accumulation.

Parameters
  • trainer (Trainer) – common.optimizers.Trainer object used for model training with gradient accumulation.

  • runconfig_params (dict) – Runtime configs dictionary.

  • summary_dict (dict) – Dictionary with keys containing summary names and values containing tensors to be written into summaries, e.g., {"train/total_loss": total_loss}. In case of distributed training, the tensors will be mean-reduced across all replicas.

  • logging_dict (dict) – Dictionary with keys containing log names and values containing tensors to be logged, e.g., {"loss": total_loss} will log "loss = <total_loss_value>, step = <global_step>".
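A sketch of calling this entry point, assuming a `Trainer` instance and a runtime-config dict are available. `build_hook_inputs` is a hypothetical helper that just assembles the two dicts; in real use the values would be `Tensor` objects, and the exact keys expected in `runconfig_params` depend on the surrounding codebase:

```python
def build_hook_inputs(total_loss, learning_rate):
    # Keys are summary/log tags; values would be Tensors in real use.
    summary_dict = {
        "train/total_loss": total_loss,
        "train/lr": learning_rate,
    }
    logging_dict = {"loss": total_loss}
    return summary_dict, logging_dict

def make_hooks(trainer, runconfig_params, total_loss, learning_rate):
    # Hypothetical wiring: build logging, step-counter, and summary-saver
    # hooks in one call for use with gradient accumulation.
    from common.tf.hooks.grad_accum_hooks import get_grad_accum_hooks
    summary_dict, logging_dict = build_hook_inputs(total_loss, learning_rate)
    return get_grad_accum_hooks(
        trainer,
        runconfig_params=runconfig_params,
        summary_dict=summary_dict,
        logging_dict=logging_dict,
    )
```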

Module contents#