Images from words

Our Take

Ever wonder how AI turns "dog with a blue beret" into an actual image? I did, and did a bit of research. Here's a breakdown of what, as far as I understand it (so treat this as a biased view), is happening behind the scenes, minus the heavy math.

basic flow

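  • the text prompt goes into a frozen text encoder (pretrained, not updated while the image model trains)

  • the encoder turns it into embeddings, numeric vectors the image model can work with

  • a diffusion model starts from pure noise and removes it bit by bit, guided by those embeddings

  • super-resolution models take the small result, clean it up and make it bigger

  • the final image comes out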
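
To make the flow above a bit more concrete, here's a tiny toy sketch in plain Python. Everything in it is an assumption for illustration: `encode_text`, `toy_denoiser`, `sample` and `upscale` are made-up names, the "encoder" is just a hash, the "denoiser" is not a trained network, and the "image" is a list of 8 numbers. It only shows the shape of the loop (encode, start from noise, remove a little predicted noise per step, upscale), not how Imagen or Stable Diffusion actually do it.

```python
import hashlib
import random


def encode_text(prompt, dim=8):
    """Toy stand-in for a frozen text encoder: same prompt, same vector."""
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def toy_denoiser(x, embedding):
    """Hypothetical 'model' that predicts the noise left in x.

    A real diffusion model is a trained neural network. Here we cheat and
    pretend the clean 'image' is the embedding itself, so the noise is
    simply the difference between the current sample and the embedding.
    """
    return [xi - ei for xi, ei in zip(x, embedding)]


def sample(prompt, steps=50, step_size=0.2, seed=0):
    """Start from random noise and remove a little predicted noise per step."""
    rng = random.Random(seed)
    embedding = encode_text(prompt)
    x = [rng.gauss(0.0, 1.0) for _ in embedding]  # pure noise
    for _ in range(steps):
        predicted_noise = toy_denoiser(x, embedding)
        x = [xi - step_size * ni for xi, ni in zip(x, predicted_noise)]
    return x


def upscale(pixels, factor=2):
    """Nearest-neighbour repeat as a stand-in for a super-resolution model."""
    return [p for p in pixels for _ in range(factor)]


if __name__ == "__main__":
    small = sample("dog with a blue beret")
    big = upscale(small)
    print(len(small), len(big))
```

The point of the toy is the control flow: the text encoder runs once, the denoiser runs many times, and upscaling happens only at the end.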

stack

  • text-to-image models (like Imagen, Parti, Stable Diffusion)

  • CLIP for linking words to what they look like (text and images share one embedding space)

  • transformer architecture handling the heavy lifting

  • upscalers making small images big

  • VQ-GAN for turning images into discrete tokens and back (the image side of Parti)
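
Two ideas from the stack above are easy to sketch without any ML library: CLIP-style matching (compare a text vector and an image vector by cosine similarity) and VQ-GAN-style quantization (snap a patch vector to its nearest codebook entry and keep only the index). The function names and all the vectors below are made up for illustration; real models learn these vectors and use hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image_vec, caption_vecs):
    """CLIP-style matching: pick the caption whose vector is closest in angle."""
    return max(
        caption_vecs,
        key=lambda caption: cosine_similarity(image_vec, caption_vecs[caption]),
    )


def quantize(patch_vec, codebook):
    """VQ-style lookup: index of the nearest codebook entry (squared distance)."""
    def squared_distance(code):
        return sum((p - c) ** 2 for p, c in zip(patch_vec, code))

    return min(range(len(codebook)), key=lambda i: squared_distance(codebook[i]))


if __name__ == "__main__":
    # Made-up 3-d embeddings, purely illustrative.
    image_vec = [0.9, 0.1, 0.0]
    caption_vecs = {"a dog": [1.0, 0.0, 0.0], "a car": [0.0, 1.0, 0.0]}
    print(best_caption(image_vec, caption_vecs))

    # Made-up 2-d codebook with three entries.
    codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    print(quantize([0.9, 0.8], codebook))
```

Once image patches are reduced to codebook indices like this, a transformer can treat an image as a sequence of tokens, much like text.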

resources

  • Jay Alammar's blog - visual explanations that actually make sense

  • Stanford's CS231n course - fundamentals of computer vision

  • Hugging Face diffusion course - hands-on with actual models

  • AssemblyAI YouTube series on diffusion models

  • Andrej Karpathy's neural nets course

  • Keras examples of implementing basic models

  • Papers:

    • Imagen paper (Google)

    • Parti paper (scaling study)

    • Stable Diffusion paper (for the open source angle)
