Reading prompts from a file#
You can automate invoke.py by providing a text file with the prompts you want to run, one line per prompt. The text file must be created with a plain-text editor (e.g. Notepad), not a word processor. Each line should look like what you would type at the invoke> prompt:
"a beautiful sunny day in the park, children playing" -n4 -C10
"stormy weather on a mountaintop, goats grazing" -s100
"innovative packaging for a squid's dinner" -S137038382
Then pass the name of this file to invoke.py when you call it:
python scripts/invoke.py --from_file "/path/to/prompts.txt"
You can also read a series of prompts from standard input by specifying a filename of -. For example, here is a Python script that creates a matrix of prompts, each varying slightly:
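#!/usr/bin/env python

# Values to combine into a 3 x 3 x 3 matrix of prompts.
adjectives = ['sunny', 'rainy', 'overcast']
samplers = ['k_lms', 'k_euler_a', 'k_heun']
cfg = [7.5, 9, 11]

# Print one invoke.py prompt line per combination.
for adj in adjectives:
    for samp in samplers:
        for cg in cfg:
            print(f'a {adj} day -A{samp} -C{cg}')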
The output looks like this (abbreviated):
a sunny day -Ak_lms -C7.5
a sunny day -Ak_lms -C9
a sunny day -Ak_lms -C11
a sunny day -Ak_euler_a -C7.5
a sunny day -Ak_euler_a -C9
...
a overcast day -Ak_heun -C9
a overcast day -Ak_heun -C11
To feed it to invoke.py, pass the filename "-":
python matrix.py | python scripts/invoke.py --from_file -
Once the script finishes, all 27 combinations of adjective, sampler, and CFG will have been run.
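As a cross-check on the count of 27, the same matrix can be sketched with itertools.product. This is only an illustrative variant of the script above; the function name build_prompts and the constant names are assumptions made for this example, not part of InvokeAI.

```python
from itertools import product

# Hypothetical constants mirroring the lists in the script above.
ADJECTIVES = ["sunny", "rainy", "overcast"]
SAMPLERS = ["k_lms", "k_euler_a", "k_heun"]
CFG_SCALES = [7.5, 9, 11]


def build_prompts():
    """Return one prompt line per (adjective, sampler, CFG) combination."""
    return [
        f"a {adj} day -A{samp} -C{cfg}"
        for adj, samp, cfg in product(ADJECTIVES, SAMPLERS, CFG_SCALES)
    ]


if __name__ == "__main__":
    # 3 adjectives x 3 samplers x 3 CFG values = 27 lines on standard output.
    print("\n".join(build_prompts()))
```

Because the lines go to standard output, the script can be piped into invoke.py in the same way as matrix.py.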
The command-line interface provides !fetch and !replay commands that let you read the prompts from a single previously generated image or an entire directory of them, write the prompts to a file, and then replay them. Alternatively, you can create your own prompt file and feed it to the command-line client from within an interactive session. See Command-Line Interface for details.
Negative and Unconditioned Prompts#
Any words between a pair of square brackets tell Stable Diffusion to try to exclude that concept from the generated image.
this is a test prompt [not really] to make you understand [cool] how this works.
In the prompt above, the words "not really cool" are ignored by Stable Diffusion.
Here's a prompt that illustrates what it does.
original prompt:
"A fantastic translucent pony of water and foam, ethereal, radiant, hyperalism, Scottish folklore, digital painting, artstation, concept art, smooth, 8k Frostbite 3 engine, ultra-detailed, art by Artgerm and Greg Rutkowski and Magali Villeneuve" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180
The resulting image contains a woman. If we want the horse without a rider, we can steer the image away from including a woman by putting [woman] in the prompt, like this:
"A fantastic translucent pony of water and foam, ethereal, radiant, hyperalism, Scottish folklore, digital painting, artstation, concept art, smooth, 8k Frostbite 3 engine, ultra-detailed, art by Artgerm and Greg Rutkowski and Magali Villeneuve [woman]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180
That's nice, but let's say we also don't want the image to be quite so blue. We can add "blue" to the list of negative prompts, so it is now [woman blue]:
"A fantastic translucent pony of water and foam, ethereal, radiant, hyperalism, Scottish folklore, digital painting, artstation, concept art, smooth, 8k Frostbite 3 engine, ultra-detailed, art by Artgerm and Greg Rutkowski and Magali Villeneuve [woman blue]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180
Getting close, but there's no point in having a saddle if our horse doesn't have a rider, so let's add one more negative prompt: [woman blue saddle].
"A fantastic translucent pony of water and foam, ethereal, radiant, hyperalism, Scottish folklore, digital painting, artstation, concept art, smooth, 8k Frostbite 3 engine, ultra-detailed, art by Artgerm and Greg Rutkowski and Magali Villeneuve [woman blue saddle]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180
Notes on this feature:
- The only requirement for words to be ignored is that they are between a pair of square brackets.
- You can specify multiple words within the same pair of brackets.
- You can specify multiple brackets containing multiple words in various places in your prompt. That works fine.
- To improve typical anatomical problems, you can add negative prompts like [bad anatomy, extra legs, extra arms, extra fingers, badly drawn hands, badly drawn feet, disfigured, out of frame, tiling, bad art, deformed, mutated].
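The incremental [woman], [woman blue], [woman blue saddle] workflow above can be sketched as a small string helper. The function add_negative_terms is hypothetical and exists only for this example; it works on the prompt text itself, before any quoting or command-line switches are added.

```python
def add_negative_terms(prompt, negatives):
    """Append a single [bracketed] group of negative terms to a prompt.

    Hypothetical helper for illustration; not part of InvokeAI.
    """
    terms = [term.strip() for term in negatives if term.strip()]
    for term in terms:
        # Brackets inside a term would change how the group is read.
        if "[" in term or "]" in term:
            raise ValueError(f"negative term must not contain brackets: {term!r}")
    if not terms:
        return prompt
    return f"{prompt} [{' '.join(terms)}]"


if __name__ == "__main__":
    base = "A fantastic translucent pony of water and foam"
    print(add_negative_terms(base, ["woman", "blue", "saddle"]))
```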
Prompt Syntax Features#
The InvokeAI prompt language has the following features:
Attention weighting#
Append - or + to a word or phrase, or give it a weight between 0 and 2 (1 = default), to decrease or increase "attention" (a mix of a per-token CFG weighting multiplier and, for -, a weighted blend with the prompt without the term).
The following syntax is recognized:
- single words without parentheses:
  a tall thin man picking apricots+
- single or multiple words with parentheses:
  a tall thin man picking (apricots)+
  a tall thin man picking (apricots)-
  a tall thin man (picking apricots)+
  a tall thin man (picking apricots)-
- more effect with more symbols:
  a tall thin man (picking apricots)++
- nesting:
  a tall thin man (picking apricots+)++
  (apricots effectively gets +++)
- all of the above with explicit numbers:
  a tall thin man picking (apricots)1.1
  a tall thin man (picking (apricots)1.3)1.1
  (+ is equivalent to 1.1, ++ is pow(1.1,2), +++ is pow(1.1,3), etc.; - means 0.9, -- means pow(0.9,2), etc.)
- attention also applies to [unconditioning], so
  a tall thin man picking apricots [(ladder)0.01]
  will very gently nudge SD away from drawing the man on a ladder
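The arithmetic behind the symbols can be sketched in a few lines of Python. This only reproduces the numbers stated in the list (1.1 per + and 0.9 per -, with nested symbols multiplying); the function names are hypothetical and nothing here reflects how InvokeAI's parser is actually implemented.

```python
def symbol_weight(symbols):
    """Multiplier for a run of '+' or '-' symbols, e.g. '++' -> 1.1 ** 2."""
    if symbols == "":
        return 1.0  # no symbols: default attention
    if set(symbols) == {"+"}:
        return 1.1 ** len(symbols)
    if set(symbols) == {"-"}:
        return 0.9 ** len(symbols)
    raise ValueError(f"expected only '+' or only '-', got {symbols!r}")


def nested_weight(*levels):
    """Combine nested symbol runs by multiplying them, innermost first."""
    weight = 1.0
    for symbols in levels:
        weight *= symbol_weight(symbols)
    return weight


if __name__ == "__main__":
    # (picking apricots+)++ : 'apricots' gets '+' inside and '++' outside.
    print(nested_weight("+", "++"))
    print(symbol_weight("+++"))
```

Under these assumptions the nested form and the flat +++ form agree up to floating-point rounding, which matches the note that apricots "effectively gets +++".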
You can use this to increase or decrease the amount of something. Starting from the prompt a man picking apricots from a tree, let's see what happens if we increase or decrease how much attention Stable Diffusion pays to the word apricots:
Use - to reduce the apricot-ness:
a man picking apricots- from a tree | a man picking apricots-- from a tree | a man picking apricots--- from a tree |
---|---|---|
![]() | ![]() | ![]() |
Use + to increase the apricot-ness:
a man picking apricots+ from a tree | a man picking apricots++ from a tree | a man picking apricots+++ from a tree | a man picking apricots++++ from a tree | a man picking apricots+++++ from a tree |
---|---|---|---|---|
![]() | ![]() | ![]() | ![]() | ![]() |
You can also change the balance between different parts of a prompt. For example, below is a mountain man:
And here he is with more mountain:
mountain+ man | mountain++ man | mountain+++ man |
---|---|---|
![]() | ![]() | ![]() |
Or, alternatively, with more man:
mountain man+ | mountain man++ | mountain man+++ | mountain man++++ |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
Blending between prompts#
- ("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,1)
- The existing prompt blending using :<weight> is still supported: ("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,1) is equivalent to a tall thin man picking apricots:1 a tall thin man picking pears:1 in the old syntax.
- Attention weights can be nested inside blends.
- Non-normalized blends are supported by passing no_normalize as an additional argument to the blend weights, e.g. ("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,-1,no_normalize). This is a lot of fun for exploring local maxima in feature space, but it also easily produces garbage output.
For more information on how this works, see the Prompt Blending section below.
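To make the equivalence between the old :<weight> syntax and .blend() concrete, here is a rough converter sketch. The function legacy_to_blend and its regular expression are assumptions made for this example; it only handles prompts in which every segment ends in a numeric weight, and it is not InvokeAI's own parser.

```python
import re

# One "<text>:<number>" segment; the text runs back to the previous weight.
_SEGMENT = re.compile(r"\s*(.+?)\s*:\s*(-?\d+(?:\.\d+)?)")


def legacy_to_blend(prompt):
    """Rewrite 'text:1 other text:1' as ("text", "other text").blend(1,1)."""
    parts = []
    pos = 0
    for match in _SEGMENT.finditer(prompt):
        parts.append((match.group(1), match.group(2)))
        pos = match.end()
    # This sketch rejects trailing text that has no weight of its own.
    if not parts or prompt[pos:].strip():
        raise ValueError("expected every segment to end in ':<weight>'")
    texts = ", ".join(f'"{text}"' for text, _ in parts)
    weights = ",".join(weight for _, weight in parts)
    return f"({texts}).blend({weights})"


if __name__ == "__main__":
    old = "a tall thin man picking apricots:1 a tall thin man picking pears:1"
    print(legacy_to_blend(old))
```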
Cross-attention control ('prompt2prompt')#
Sometimes an image you generate is almost right and you want to change just one detail without affecting the rest. You could use a photo editor and inpainting to paint over the area, but that is tedious. This is where prompt2prompt comes in handy.
Generate an image with a given prompt, record the image's seed, and then use the prompt2prompt syntax to substitute words in the original prompt with words in a new prompt. This works for img2img as well.
- a ("fluffy cat").swap("smiling dog") eating a hot dog
  - Quotation marks are optional: a (fluffy cat).swap(smiling dog) eating a hot dog
  - For single-word substitutions, parentheses are also optional: a cat.swap(dog) eating a hot dog
- Supports the options s_start, s_end, t_start, and t_end (each 0-1), which loosely correspond to bloc97's prompt_edit_spatial_start/_end and prompt_edit_tokens_start/_end, but with the math swapped to make them easier to understand intuitively.
  - Example usage: a (cat).swap(dog, s_end=0.3) eating a hot dog - the s_end argument means that the "spatial" (self-attention) edit stops having any effect after 30% (= 0.3) of the steps have been done, leaving Stable Diffusion with 70% of the steps in which it is free to decide for itself how to reshape the cat form into a dog form.
  - The numbers represent a percentage through the step sequence where the edits should happen. 0 means the start (noisy starting image), 1 the end (final image).
    - For img2img, the step sequence does not start at 0 but at (1-strength), so if strength is 0.7, s_start and s_end must both be greater than 0.3 (1-0.7) to have any effect.
- Convenience option shape_freedom (0-1) to specify how much "freedom" Stable Diffusion should have to change the shape of the subject being swapped.
  - a (cat).swap(dog, shape_freedom=0.5) eating a hot dog
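The fractions in the options above translate into step counts with simple arithmetic. The sketch below is purely illustrative: both function names are hypothetical, and rounding to the nearest step is an assumption, since the text does not say how InvokeAI maps fractions onto individual steps.

```python
def edited_step_count(total_steps, s_end):
    """Number of leading steps during which the spatial edit is active.

    Assumes rounding to the nearest whole step (not confirmed by the docs).
    """
    if not 0.0 <= s_end <= 1.0:
        raise ValueError("s_end must be between 0 and 1")
    return round(total_steps * s_end)


def img2img_start_fraction(strength):
    """Fraction of the step sequence at which img2img begins: 1 - strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return 1.0 - strength


if __name__ == "__main__":
    steps = 20
    edited = edited_step_count(steps, s_end=0.3)
    print(f"{edited} edited steps, {steps - edited} free steps")
    print(f"img2img with strength 0.7 starts at {img2img_start_fraction(0.7):.1f}")
```

With 20 steps and s_end=0.3, this gives 6 edited steps and 14 free ones, matching the 30%/70% split described in the example.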
The prompt2prompt code is based on bloc97's colab.
Note that prompt2prompt does not currently work with the RunwayML inpainting model, and may never work due to the way that model is set up. If you try to use prompt2prompt with it, you will get the original image back. However, since this model is so good at inpainting, a good substitute is to use the clipseg text masking option:
invoke> a fluffy cat eating a hotdog
Outputs:
[1010] outputs/000025.2182095108.png: a fluffy cat eating a hotdog
invoke> a smiling dog eating a hotdog -I 000025.2182095108.png -tm cat
Escaping parentheses () and speech marks ""#
If the model you are using has parentheses () or speech marks "" as part of its syntax, you must "escape" them with a backslash, so that (my_keyword) becomes \(my_keyword\). Otherwise, the prompt parser tries to interpret the parentheses as part of the prompt syntax and gets confused.
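When keywords are inserted programmatically, the escaping rule can be applied with a small helper. This is a minimal sketch: escape_prompt_text is a hypothetical name, and the function does not check whether a character has already been escaped.

```python
# Map each special character to its backslash-escaped form.
_ESCAPES = str.maketrans({"(": r"\(", ")": r"\)", '"': r"\""})


def escape_prompt_text(text):
    """Backslash-escape parentheses and double quotes in a keyword."""
    return text.translate(_ESCAPES)


if __name__ == "__main__":
    print(escape_prompt_text("(my_keyword)"))
```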
Prompt Blending#
You can blend together different sections of the prompt to explore the AI's latent semantic space and create interesting (and often surprising!) variations. The syntax is:
blue sphere:0.25 red cube:0.75 hybrid
This tells the sampler to blend 25% of the concept of a blue sphere with 75% of the concept of a red cube. The blend weights can use any combination of integers and floating point numbers and do not have to add up to 1. Everything to the left of a :XX, up to the previous :XX, is used for blending, so the overall effect is:
0.25 * "blue sphere" + 0.75 * "red cube" + hybrid
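The weighted sum above can be sketched numerically with toy vectors standing in for the concepts. Everything in this block is an assumption made for illustration: the real embeddings are far larger, dividing by the sum of the weights is only one plausible reading of "normalized", and the unweighted "hybrid" term is not modeled at all.

```python
from typing import List, Sequence


def blend(vectors: Sequence[Sequence[float]],
          weights: Sequence[float],
          normalize: bool = True) -> List[float]:
    """Return the weighted sum of equally sized vectors.

    With normalize=True the weights are first divided by their sum
    (an assumed interpretation, not confirmed by the documentation).
    """
    if not vectors or len(vectors) != len(weights):
        raise ValueError("need one weight per vector")
    if normalize:
        total = sum(weights)
        if total == 0:
            raise ValueError("weights sum to zero; cannot normalize in this sketch")
        weights = [w / total for w in weights]
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


if __name__ == "__main__":
    blue_sphere = [1.0, 0.0]  # toy stand-in for a concept embedding
    red_cube = [0.0, 1.0]     # toy stand-in for a concept embedding
    print(blend([blue_sphere, red_cube], [0.25, 0.75]))
```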
Because you are exploring the "mind" of the AI, its way of mixing two concepts may not match yours, leading to surprising effects. To illustrate, here are several images generated using different combinations of blend weights. As usual, unless you fix the seed, the prompts will give different results each time you run them.
"blue sphere, red cube, hybrid"#
This example doesn't use blending at all and represents the default way of mixing concepts.
It's interesting to see how the AI expressed the concept of the "cube" as the four quadrants of the enclosing frame. If you look closely, there is depth there, so the enclosing frame is actually a cube.
"blue sphere:0.25 red cube:0.75 hybrid"#
Now that's interesting. We get neither a blue sphere nor a red cube, but a red sphere embedded in a brick wall, representing a fusion of concepts within the AI's "latent space" of semantic representations. Where is Ludwig Wittgenstein when you need him?
"blue sphere:0.75 red cube:0.25 hybrid"#
Definitely more blue-spherey. The cube is gone entirely, but it's really cool abstract art.
"blue sphere:0.5 red cube:0.5 hybrid"#
Whoa...! I see blue and red, but no spheres or cubes. Is the word "hybrid" conjuring up the concept of some kind of sci-fi creature? Let's find out.
"blue sphere:0.5 red cube:0.5"#
Indeed, removing the word "hybrid" creates an image closer to what we expect.
In summary, prompt blending is great for exploring the creative space, but it can be difficult to direct. An upcoming version of InvokeAI will feature more deterministic prompt weighting.
Last update: December 23, 2022
Created: September 18, 2022