modelzoo.vision.pytorch.dit.layers.GaussianDiffusion.GaussianDiffusion

class modelzoo.vision.pytorch.dit.layers.GaussianDiffusion.GaussianDiffusion

Bases: torch.nn.Module

Generate noisy images via Gaussian diffusion. The class implements the noising process described in Step 5 of Algorithm 1 of the paper "Denoising Diffusion Probabilistic Models" (https://arxiv.org/abs/2006.11239).

Parameters
  • num_diffusion_steps (int) – Number of diffusion steps.

  • seed (int) – Random seed for reproducibility.

  • beta_start (float) – Initial value of the variance schedule, i.e. beta_1, the minimum variance of the generated Gaussian noise. The default follows Ho et al., Section 4 (https://arxiv.org/pdf/2006.11239.pdf).

  • beta_end (float) – Final value of the variance schedule, i.e. beta_T, the maximum variance of the generated Gaussian noise. The default follows Ho et al., Section 4 (https://arxiv.org/pdf/2006.11239.pdf).
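
As a rough illustration of how beta_start and beta_end bound the variance schedule, the sketch below builds a linear schedule with the default endpoints and derives the cumulative products usually written alpha_bar_t. It assumes a linear schedule as used by Ho et al.; which schedules schedule_name actually selects is not documented here, and the helper names linear_beta_schedule and cumulative_alphas are hypothetical, not part of modelzoo.

```python
def linear_beta_schedule(num_diffusion_steps, beta_start=0.0001, beta_end=0.02):
    """Hypothetical helper: evenly spaced betas from beta_start to beta_end."""
    if num_diffusion_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_diffusion_steps - 1)
    return [beta_start + i * step for i in range(num_diffusion_steps)]


def cumulative_alphas(betas):
    """Hypothetical helper: alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


# Assumed setting: 1000 steps with the default endpoints.
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
```

With these defaults alpha_bars starts just below 1 and decays toward 0, so early timesteps keep most of the signal and late timesteps are almost pure noise.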

Methods

forward

Look up alpha-related constants and create a noised sample from latent, a float tensor of size (B, C, H, W).

__call__(*args: Any, **kwargs: Any) Any

Call self as a function.

__init__(num_diffusion_steps, schedule_name, seed=None, beta_start=0.0001, beta_end=0.02)
Parameters
  • num_diffusion_steps (int) – Number of diffusion steps.

  • seed (int) – Random seed for reproducibility.

  • beta_start (float) – Initial value of the variance schedule, i.e. beta_1, the minimum variance of the generated Gaussian noise. The default follows Ho et al., Section 4 (https://arxiv.org/pdf/2006.11239.pdf).

  • beta_end (float) – Final value of the variance schedule, i.e. beta_T, the maximum variance of the generated Gaussian noise. The default follows Ho et al., Section 4 (https://arxiv.org/pdf/2006.11239.pdf).

static __new__(cls, *args: Any, **kwargs: Any) Any

forward(latent, noise, timestep)

Look up alpha-related constants and create a noised sample.

Parameters
  • latent (Tensor) – Float tensor of size (B, C, H, W).

Returns

A tuple of the noisy images, the ground-truth noise, and the timesteps that index the scheduled noise variance.
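
The noising that forward performs can be sketched in plain Python using the closed form from Ho et al., x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise. This is a minimal sketch, not the modelzoo implementation: it assumes a linear schedule, represents each sample as a flattened list of floats rather than a (B, C, H, W) tensor, and the names make_alpha_bars and noise_batch are hypothetical.

```python
import math
import random


def make_alpha_bars(num_diffusion_steps, beta_start=0.0001, beta_end=0.02):
    """Hypothetical helper: cumulative products of (1 - beta_t), linear schedule assumed."""
    alpha_bars = []
    running = 1.0
    for i in range(num_diffusion_steps):
        frac = i / max(num_diffusion_steps - 1, 1)
        beta = beta_start + frac * (beta_end - beta_start)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_batch(latent, noise, timestep, alpha_bars):
    """Hypothetical stand-in for forward.

    latent, noise: lists of B flattened samples; timestep: list of B ints.
    Returns (noisy samples, ground-truth noise, timesteps).
    """
    noisy = []
    for x0, eps, t in zip(latent, noise, timestep):
        signal_scale = math.sqrt(alpha_bars[t])        # sqrt(alpha_bar_t)
        noise_scale = math.sqrt(1.0 - alpha_bars[t])   # sqrt(1 - alpha_bar_t)
        noisy.append([signal_scale * x + noise_scale * e for x, e in zip(x0, eps)])
    return noisy, noise, timestep


rng = random.Random(0)  # fixed seed for reproducibility
alpha_bars = make_alpha_bars(1000)
latent = [[1.0, -1.0, 0.5, 0.0], [0.25, 0.75, -0.5, 1.0]]
noise = [[rng.gauss(0.0, 1.0) for _ in sample] for sample in latent]
timestep = [0, 999]  # one nearly clean sample, one nearly pure noise
noisy, target_noise, steps = noise_batch(latent, noise, timestep, alpha_bars)
```

The first sample (timestep 0) stays close to its latent, while the second (timestep 999) is almost indistinguishable from its noise. The returned noise is what a denoising model would be trained to predict.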