Visualizing images between text prompts

Even though Stable Diffusion has been available for a couple of years, I still find diffusion models fascinating. I won’t go into too much detail on how they work, as others have already covered it extensively and in greater depth (further reading + code used is at the end of this post).

A quick summary - when prompted with some text, a diffusion model starts with an image containing random noise, and then works its way backwards by repeatedly removing noise, ultimately resulting in an image that best represents the text prompt. The same process can be used to create images that depict the transition from one text prompt to another, as shown in the images below.
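To make the "working backwards from noise" idea concrete, here is a toy sketch with NumPy. It is not a real diffusion model: we cheat by assuming the denoiser's noise prediction is perfect (we already know the clean image), whereas a real model learns that prediction from data. The point is only the shape of the loop: start from noise and subtract a fraction of the predicted noise at each step.

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.uniform(size=(8, 8))   # the "clean image" the process should recover
x = rng.normal(size=(8, 8))     # start from pure random noise

# Reverse process: at each step, remove a fraction of the predicted noise.
# A "perfect" denoiser's noise prediction is simply x - x0 here.
steps = 50
for t in range(steps, 0, -1):
    noise_estimate = x - x0
    x = x - noise_estimate / t

print(np.abs(x - x0).max())  # essentially 0: the noise is fully removed
```

After the loop the image is recovered almost exactly, since the final step (t = 1) removes all remaining predicted noise; a real model's imperfect predictions are what make its outputs novel rather than a copy of training data.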

Prompt 1

This gif was created from the 2 prompts below. Using stable diffusion magic (interpolation), images are generated along the transition from the first prompt to the second, and those images are then stitched together into a .gif file.

  • A watercolor painting of a Golden Retriever at the beach
  • A still life DSLR photo of a bowl of fruit
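The interpolation step can be sketched as follows. A common trick for blending diffusion embeddings/latents is spherical interpolation (slerp) rather than a straight linear blend, since a plain lerp can drift away from the region the model expects. The `emb_dog` and `emb_fruit` vectors below are random stand-ins for the two prompts' text embeddings, and the 768 dimension is just an assumption for illustration:

```python
import numpy as np

def slerp(v0, v1, t):
    """Spherical interpolation between two vectors for t in [0, 1]."""
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < 1e-6:  # nearly parallel: fall back to a plain lerp
        return (1 - t) * v0 + t * v1
    return (np.sin((1 - t) * theta) * v0 + np.sin(t * theta) * v1) / np.sin(theta)

# Stand-ins for the CLIP text embeddings of the two prompts.
rng = np.random.default_rng(1)
emb_dog, emb_fruit = rng.normal(size=768), rng.normal(size=768)

# One interpolated embedding per frame of the gif; each would be fed to
# the diffusion model to generate one in-between image.
frames = [slerp(emb_dog, emb_fruit, t) for t in np.linspace(0, 1, 30)]
```

At t = 0 this returns the first embedding exactly and at t = 1 the second, so the gif starts and ends on clean renderings of each prompt.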

Prompt 2

How about a creepier variation?

  • A watercolor painting of a Chesapeake Bay Retriever at the beach
  • An alien in medieval armour, M. C. Escher, Hyperrealist

Prompt 3

What about using 4 prompts (crazy)? Same idea as before, but interpolating between the 4 prompts below:

  • A watercolor painting of a Golden Retriever at the beach
  • A still life DSLR photo of a bowl of fruit
  • The eiffel tower in the style of starry night
  • An architectural sketch of a skyscraper
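One hedged sketch of how 4 prompts could be blended: place the four embeddings at the corners of a square and interpolate bilinearly, so each (u, v) coordinate gives a mix of all four; walking around the square's edges then morphs through each prompt in turn. The embeddings below are random stand-ins, and the 768 dimension is an assumption for illustration:

```python
import numpy as np

def bilerp(c00, c10, c01, c11, u, v):
    """Bilinear blend of 4 corner vectors; (u, v) in [0, 1]^2."""
    top = (1 - u) * c00 + u * c10
    bottom = (1 - u) * c01 + u * c11
    return (1 - v) * top + v * bottom

# Stand-ins for the 4 prompts' text embeddings.
rng = np.random.default_rng(2)
corners = rng.normal(size=(4, 768))

mid = bilerp(*corners, 0.5, 0.5)  # an equal blend of all four prompts
```

At the corners, e.g. (u, v) = (0, 0), this returns the matching prompt's embedding exactly, so the loop passes cleanly through each of the four prompts.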

Code

These images were created using Stable Diffusion 2.1; see the code used in this Jupyter Notebook.
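The final stitching step can be done with Pillow. The sketch below uses random noise images as stand-ins for the generated frames, and the output filename is just an assumption:

```python
import numpy as np
from PIL import Image

# Random noise frames as stand-ins for the model's generated images.
rng = np.random.default_rng(3)
frames = [Image.fromarray(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))
          for _ in range(10)]

# Stitch the frames into an animated gif: 100 ms per frame, looping forever.
frames[0].save("transition.gif", save_all=True,
               append_images=frames[1:], duration=100, loop=0)
```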

Additional reading

  • A very detailed blog post about diffusion processes, the architecture of diffusion models, and examples (link)
  • The illustrated Stable Diffusion - by Jay Alammar (link)