Pipeline for NCSN
NCSNPipeline is a pipeline for training and inference of Noise Conditional Score Networks (NCSN), proposed by Yang Song and Stefano Ermon in the paper Generative Modeling by Estimating Gradients of the Data Distribution. The pipeline is designed to be used with the UNet2DModelForNCSN model and the AnnealedLangevinDynamicsScheduler scheduler.
The abstract of the paper is the following:
We introduce a new generative model where samples are produced via Langevin dynamics using gradients of the data distribution estimated with score matching. Because gradients can be ill-defined and hard to estimate when the data resides on low-dimensional manifolds, we perturb the data with different levels of Gaussian noise, and jointly estimate the corresponding scores, i.e., the vector fields of gradients of the perturbed data distribution for all noise levels. For sampling, we propose an annealed Langevin dynamics where we use gradients corresponding to gradually decreasing noise levels as the sampling process gets closer to the data manifold. Our framework allows flexible model architectures, requires no sampling during training or the use of adversarial methods, and provides a learning objective that can be used for principled model comparisons. Our models produce samples comparable to GANs on MNIST, CelebA and CIFAR-10 datasets, achieving a new state-of-the-art inception score of 8.87 on CIFAR-10. Additionally, we demonstrate that our models learn effective representations via image inpainting experiments.
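The annealed Langevin dynamics described in the abstract can be sketched on a toy problem. The example below is a minimal illustration, not the pipeline's implementation: it replaces the trained score network with the analytic score of a 1D Gaussian perturbed by noise, and the names geometric_sigmas, toy_score and annealed_langevin_sample are hypothetical helpers introduced only for this sketch. The step size for each noise level is scaled as eps * sigma_i^2 / sigma_L^2, following the annealing rule in the paper, and the numeric settings are illustrative choices.

```python
import math
import random
from typing import List, Optional


def geometric_sigmas(sigma_max: float, sigma_min: float, num_levels: int) -> List[float]:
    """Noise levels decreasing geometrically from sigma_max to sigma_min."""
    ratio = (sigma_min / sigma_max) ** (1.0 / (num_levels - 1))
    return [sigma_max * ratio**i for i in range(num_levels)]


def toy_score(x: float, sigma: float, mu: float = 2.0, data_std: float = 0.5) -> float:
    """Analytic score of N(mu, data_std^2) data perturbed by N(0, sigma^2) noise.

    Assumption: this stands in for a trained score network s_theta(x, sigma).
    """
    return -(x - mu) / (data_std**2 + sigma**2)


def annealed_langevin_sample(
    sigmas: List[float],
    steps_per_level: int = 100,
    eps: float = 2e-5,
    rng: Optional[random.Random] = None,
) -> float:
    """Draw one sample by running Langevin updates at each noise level in turn."""
    if rng is None:
        rng = random.Random()
    x = rng.uniform(-1.0, 1.0)  # arbitrary starting point
    sigma_last = sigmas[-1]
    for sigma in sigmas:  # from the largest to the smallest noise level
        alpha = eps * sigma**2 / sigma_last**2  # step size for this level
        for _ in range(steps_per_level):
            z = rng.gauss(0.0, 1.0)
            x = x + 0.5 * alpha * toy_score(x, sigma) + math.sqrt(alpha) * z
    return x


rng = random.Random(0)
sigmas = geometric_sigmas(sigma_max=1.0, sigma_min=0.01, num_levels=10)
samples = [annealed_langevin_sample(sigmas, rng=rng) for _ in range(200)]
sample_mean = sum(samples) / len(samples)
print(f"sample mean: {sample_mean:.2f} (toy data mean is 2.0)")
```

The large early noise levels let the chain move quickly toward the region where the data lives, while the later, smaller levels refine the sample with correspondingly smaller steps.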
NCSNPipeline
- class ncsn.pipeline_ncsn.NCSNPipeline(unet, scheduler)
Pipeline for unconditional image generation using Noise Conditional Score Network (NCSN).
This model inherits from DiffusionPipeline. Check the superclass documentation for the generic methods implemented for all pipelines (downloading, saving, running on a particular device, etc.).
- Parameters:
unet (UNet2DModelForNCSN) – A UNet2DModelForNCSN to estimate the score of the image.
scheduler (AnnealedLangevinDynamicsScheduler) – An AnnealedLangevinDynamicsScheduler to be used in combination with unet to estimate the score of the image.
- __call__(batch_size=1, num_inference_steps=10, generator=None, output_type='pil', return_dict=True, callback_on_step_end=None, callback_on_step_end_tensor_inputs=None, **kwargs)
The call function to the pipeline for generation.
- Parameters:
batch_size (int, optional, defaults to 1) – The number of images to generate.
num_inference_steps (int, optional, defaults to 10) – The number of inference steps.
generator (torch.Generator, optional) – A torch.Generator to make generation deterministic.
output_type (str, optional, defaults to “pil”) – The output format of the generated image. Choose between PIL.Image or np.array.
return_dict (bool, optional, defaults to True) – Whether or not to return an ImagePipelineOutput instead of a plain tuple.
callback_on_step_end (Callable, PipelineCallback, MultiPipelineCallbacks, optional) – A function, or a subclass of PipelineCallback or MultiPipelineCallbacks, that is called at the end of each denoising step during inference with the following arguments: callback_on_step_end(self: DiffusionPipeline, step: int, timestep: int, callback_kwargs: Dict). callback_kwargs will include a list of all tensors as specified by callback_on_step_end_tensor_inputs.
callback_on_step_end_tensor_inputs (List, optional) – The list of tensor inputs for the callback_on_step_end function. The tensors specified in the list will be passed as the callback_kwargs argument. You can only include variables listed in the ._callback_tensor_inputs attribute of your pipeline class.
- Returns:
If return_dict is True, diffusers.ImagePipelineOutput is returned, otherwise a tuple is returned where the first element is a list with the generated images.
- Return type:
diffusers.ImagePipelineOutput or tuple
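As an illustration of the callback_on_step_end parameter, the sketch below defines a function with the documented argument order. The names record_step and seen_steps are hypothetical, and returning callback_kwargs is an assumption based on the convention used by other diffusers pipelines rather than something stated on this page. The commented lines show how such a callback might be passed, assuming pipe is an already constructed NCSNPipeline.

```python
from typing import Any, Dict, List

# Step indices observed by the callback, kept for inspection after generation.
seen_steps: List[int] = []


def record_step(
    pipe: Any, step: int, timestep: int, callback_kwargs: Dict[str, Any]
) -> Dict[str, Any]:
    """Callback with the documented signature; records each step index."""
    seen_steps.append(step)
    # Assumption: as in other diffusers pipelines, the callback hands the
    # (possibly modified) tensors back by returning callback_kwargs.
    return callback_kwargs


# Hypothetical usage, assuming `pipe` is an already constructed NCSNPipeline:
# output = pipe(batch_size=4, num_inference_steps=10, callback_on_step_end=record_step)
# images = output.images  # generated images when return_dict=True
```

Keeping the callback free of side effects on the pipeline itself makes it easy to exercise in isolation before attaching it to a real generation run.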
- decode_samples(samples)
Decodes the generated samples to the correct format suitable for images.
- Parameters:
samples (torch.Tensor) – The generated samples to decode.
- Returns:
The decoded samples.
- Return type: