An implementation of: Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
You need GPT4-Azure or Gemini Pro to use it. Local LLM support is still being worked on.
Neat! I’ve known that Regional Prompter is powerful, but it’s too much of a pain for me to bother using. Hopefully this makes it easier.