Image models

Text-to-image generation, image editing, and upscaling. Each parameter table lists that model's input schema; our wrapper params (out, mock, format) are noted per model.

Generations are charged in credits (see Credits & plans). Every generation model also accepts mock: true for a free placeholder result.
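As a rough illustration of a mock call, the sketch below assembles a raw job request without sending it. It assumes the wrapper params (out, mock) travel in the same JSON body as the model's input fields; the `build_job_request` helper and that body layout are illustrative assumptions, not a documented client.

```python
import json


def build_job_request(model: str, params: dict, out: str, mock: bool = False) -> dict:
    """Assemble (but do not send) a raw job request for POST /v1/jobs/<model>.

    Assumption: wrapper params (out, mock) sit next to the model's own
    input fields in one JSON body. Check your client before relying on it.
    """
    body = dict(params)        # model input schema fields, e.g. prompt
    body["out"] = out          # required wrapper param
    if mock:
        body["mock"] = True    # request a free placeholder result
    return {"method": "POST", "path": f"/v1/jobs/{model}", "body": body}


request = build_job_request(
    "flux_schnell",
    {"prompt": "a lighthouse at dusk"},
    out="drafts/lighthouse.jpg",
    mock=True,
)
print(json.dumps(request, indent=2))
```

Building the request as plain data makes it easy to inspect a call with mock enabled before spending credits on the real generation.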

FLUX.1 Schnell `flux_schnell`

Turbo-mode (1-4 step) text-to-image generation from a 12B-parameter FLUX flow transformer — fast enough for prototyping, prompt iteration, and bulk draft runs.

Call it via — image tool, action: "create", tier: "draft" (the default tier) · raw: POST /v1/jobs/flux_schnell


Cost	1 cr per call
Mode / timeout	sync / 30s (from our YAML)

Parameters — the model's input schema:

Param	Type	Required	Default	Allowed / range	Description
`prompt`	string	✓	—	—	The prompt to generate an image from.
`image_size`	string \| object		`landscape_4_3`	enum: `square_hd`, `square`, `portrait_4_3`, `portrait_16_9`, `landscape_4_3`, `landscape_16_9` — or `{width, height}` object (each 1–14142)	The size of the generated image.
`num_inference_steps`	integer		`4`	1–12	The number of inference steps to perform.
`num_images`	integer		`1`	1–4	The number of images to generate.
`guidance_scale`	number		`3.5`	1–20	CFG scale — how closely the model sticks to the prompt.
`seed`	integer \| null		`null`	—	Same seed + same prompt + same model version → same image.
`output_format`	string		`jpeg`	enum: `jpeg`, `png`	The format of the generated image.
`enable_safety_checker`	boolean		`true`	—	If true, the safety checker is enabled.
`acceleration`	string		`none`	enum: `none`, `regular`, `high`	Generation speed — higher is faster.
`sync_mode`	boolean		`false`	—	If true, media returns as a data URI and isn't stored in request history.

Our wrapper params (not part of the model input schema): out (required — output filename/workdir-relative path), mock (optional — test placeholder), and format (optional — our size preset shorts/reels/horizontal, mapped to the model's image_size field: shorts/reels → portrait_16_9, horizontal → landscape_16_9, default → portrait_16_9).
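The format mapping described above can be sketched as a small lookup. This is a reading of the documented mapping, not our wrapper's actual code; the function name and the choice to raise on unknown presets are assumptions.

```python
# Mapping as described for flux_schnell; names here are illustrative.
FORMAT_TO_IMAGE_SIZE = {
    "shorts": "portrait_16_9",
    "reels": "portrait_16_9",
    "horizontal": "landscape_16_9",
}
DEFAULT_IMAGE_SIZE = "portrait_16_9"  # used when no format is given


def resolve_image_size(fmt=None) -> str:
    """Return the image_size preset derived from a wrapper `format` value."""
    if fmt is None:
        return DEFAULT_IMAGE_SIZE
    try:
        return FORMAT_TO_IMAGE_SIZE[fmt]
    except KeyError:
        # Assumption: unknown presets are treated as an error in this sketch;
        # the real wrapper's behaviour for them is not documented here.
        raise ValueError(f"unknown format preset: {fmt!r}")


print(resolve_image_size("horizontal"))  # landscape_16_9
print(resolve_image_size())              # portrait_16_9
```

Note that the wrapper default (portrait_16_9) differs from the model's own image_size default (landscape_4_3), so omitting format does not give the same size as calling the model with image_size unset.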

Limits — billed at 1 cr per megapixel, rounded up to the nearest megapixel. Custom image_size max 14142 × 14142 px. Up to 4 images per call; 1–12 inference steps. (No prompt character limit, duration, frame count, or file-size limit is published for this model.)
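To make the per-megapixel rounding concrete, here is a minimal estimate for a single image at a custom image_size. It assumes 1 megapixel means 1,000,000 pixels and covers one image only; how num_images affects the charge is not stated above, so the sketch leaves it out.

```python
import math

CREDITS_PER_MEGAPIXEL = 1  # flux_schnell rate from the Limits note
MAX_SIDE = 14142           # custom image_size limit per side


def estimate_credits(width: int, height: int) -> int:
    """Estimate credits for one image at a custom width x height."""
    if not (1 <= width <= MAX_SIDE and 1 <= height <= MAX_SIDE):
        raise ValueError("width and height must each be 1-14142")
    # Assumption: 1 megapixel = 1,000,000 pixels; partial megapixels round up.
    megapixels = math.ceil(width * height / 1_000_000)
    return megapixels * CREDITS_PER_MEGAPIXEL


print(estimate_credits(1024, 768))   # 786,432 px -> 1
print(estimate_credits(2048, 2048))  # 4,194,304 px -> 5
```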

FLUX 1.1 [pro] ultra `flux_pro`

Text-to-image generation at up to 2K resolution (4 megapixels) with enhanced photorealism and optional reference-image conditioning.

Call it via — image tool, action: "create", tier: "fine" (MCP) · raw: POST /v1/jobs/flux_pro


Cost	12 cr per call
Mode / timeout	sync / 30s (from our YAML)

Parameters — the model's input schema:

Param	Type	Required	Default	Allowed / range	Description
`prompt`	string	✓	—	—	The prompt to generate an image from.
`seed`	integer		null	—	Same seed + same prompt + same model version → same image.
`sync_mode`	boolean		`false`	—	If true, media is returned as a data URI and not stored in request history.
`num_images`	integer		`1`	1–4	Number of images to generate.
`output_format`	string		`jpeg`	`jpeg`, `png`	Format of the generated image.
`safety_tolerance`	string		`"2"`	`"1"`–`"6"`	Content-filter level; 1 = most strict, 6 = most permissive.
`enhance_prompt`	boolean		`false`	—	Whether to enhance the prompt for better results.
`image_url`	string		null	—	Reference image URL to condition generation on.
`image_prompt_strength`	number		`0.1`	0–1	Strength of the image prompt (reference-image influence).
`aspect_ratio`	string		`9:16`	`21:9`, `16:9`, `4:3`, `3:2`, `1:1`, `2:3`, `3:4`, `9:16`, `9:21` (free-form string also accepted)	Aspect ratio of the generated image.
`raw`	boolean		`false`	—	Generate less processed, more natural-looking images.

Our wrapper params (not part of the model input schema): out (required — output filename/path), mock (optional — test placeholder), and format (optional — size preset mapped to the model's aspect_ratio field: shorts/reels→9:16, horizontal→16:9, default 9:16).

Limits — model limits:

Max resolution: 4 megapixels (up to 2048×2048). Billing rounds up to the nearest megapixel.
Max images per call: 4 (num_images).
image_prompt_strength range: 0–1.
Output formats: JPEG, PNG.
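A client-side pre-check of these ranges might look like the sketch below. The helper is hypothetical and only mirrors the table and limits above; it is not how the service itself validates input. One detail it highlights is that safety_tolerance is typed as a string, so an integer 2 would not match the schema.

```python
def check_flux_pro_params(params: dict) -> list:
    """Return a list of problems found in a flux_pro input dict (empty if none)."""
    problems = []
    if not params.get("prompt"):
        problems.append("prompt is required")
    if not 1 <= params.get("num_images", 1) <= 4:
        problems.append("num_images must be 1-4")
    tolerance = params.get("safety_tolerance", "2")
    # The schema types safety_tolerance as a string, so the int 2 is flagged.
    if tolerance not in {"1", "2", "3", "4", "5", "6"}:
        problems.append('safety_tolerance must be a string "1"-"6"')
    if not 0 <= params.get("image_prompt_strength", 0.1) <= 1:
        problems.append("image_prompt_strength must be 0-1")
    if params.get("output_format", "jpeg") not in {"jpeg", "png"}:
        problems.append("output_format must be jpeg or png")
    return problems


print(check_flux_pro_params({"prompt": "a red bicycle", "safety_tolerance": 2}))
```

The aspect_ratio field is deliberately not checked here, since the table notes that a free-form string is also accepted.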

Flux 2 LoRA Realism `flux_realism`

Text-to-image photorealism — FLUX.2 with a realism LoRA tuned for natural lighting, skin texture, and documentary-style detail; ideal for character portraits, people, products, and lifestyle scenes.

Call it via — image(action: "create", tier: "photo") · raw: POST /v1/jobs/flux_realism


Cost	Billed per megapixel — ≈4–5 cr per image at the ~1 MP presets
Mode / timeout	sync / 60s (from our YAML)

Parameters — the model's input schema:

Param	Type	Required	Default	Allowed / range	Description
`prompt`	string	✓	—	—	The prompt to generate a realistic image with natural lighting and authentic details.
`image_size`	enum \| object		`landscape_4_3`	`square_hd`, `square`, `portrait_4_3`, `portrait_16_9`, `landscape_4_3`, `landscape_16_9` — or an object `{width, height}` (each int, >0, max 14142)	The size of the generated image.
`guidance_scale`	number		`2.5`	`0`–`20`	CFG scale. How closely the model follows the prompt.
`num_inference_steps`	integer		`40`	`4`–`50`	Number of inference steps; higher enhances realism.
`acceleration`	enum		`regular`	`none`, `regular`	Acceleration level; `regular` balances speed and quality.
`seed`	integer \| null		`null`	—	Random seed for reproducibility; same seed + prompt → same result.
`sync_mode`	boolean		`false`	—	If true, media is returned as a data URI and not saved in history.
`enable_safety_checker`	boolean		`true`	—	Whether to enable the safety checker for the generated image.
`output_format`	enum		`png`	`png`, `jpeg`, `webp`	The format of the output image.
`num_images`	integer		`1`	`1`–`4`	Number of images to generate per call.
`lora_scale`	number		`1`	`0`–`2`	Strength of the realism effect.

Our wrapper params (not part of the model input schema): out (required — output filename), mock (optional — test placeholder), and format (optional — our friendly aspect preset, e.g. shorts/reels/horizontal, which we map to the model's image_size field via format_mapping: shorts/reels → portrait_16_9, horizontal → landscape_16_9).

Limits — max 4 images per call (num_images 1–4); inference steps 4–50; custom image_size object dimensions up to 14142 px per side (max ~4 MP recommended); output formats PNG / JPEG / WebP; text prompt only (no image input).
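Since a custom image_size may be far larger than the recommended ceiling, a small review step before submitting can help. The sketch below is illustrative: the helper name is made up, and treating "~4 MP" as exactly 4,000,000 pixels is an assumption.

```python
PRESETS = {"square_hd", "square", "portrait_4_3", "portrait_16_9",
           "landscape_4_3", "landscape_16_9"}
MAX_SIDE = 14142
RECOMMENDED_MAX_PIXELS = 4_000_000  # "~4 MP recommended"; exact figure assumed


def review_image_size(image_size) -> str:
    """Classify an image_size value as 'ok' or 'large'; raise if it is invalid."""
    if isinstance(image_size, str):
        if image_size not in PRESETS:
            raise ValueError(f"unknown preset: {image_size!r}")
        return "ok"
    width, height = image_size["width"], image_size["height"]
    if not (0 < width <= MAX_SIDE and 0 < height <= MAX_SIDE):
        raise ValueError("width and height must each be 1-14142")
    return "large" if width * height > RECOMMENDED_MAX_PIXELS else "ok"


print(review_image_size("portrait_16_9"))                  # ok
print(review_image_size({"width": 4096, "height": 4096}))  # large
```

Because this model is billed per megapixel, a size flagged as large also costs proportionally more than the ~1 MP presets.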

Nano Banana Pro `nano_banana`

Text-to-image on Google's Nano Banana Pro (Gemini 3 Pro Image): strong prompt adherence and best-in-class text rendering inside the image — posters, labels, UI mockups, and scenes that must follow the brief closely.

Call it via — image tool, action: "create", model: "nano_banana" (explicit model — the tier presets map to the FLUX family) · raw: POST /v1/jobs/nano_banana


Cost	30 cr per call; 4K outputs charged at 2x
Mode / timeout	sync / 2m (from our YAML)

Parameters — the model's input schema:

Param	Type	Required	Default	Allowed / range	Description
`prompt`	string	✓	—	—	What to generate.
`num_images`	integer		`1`	1–4	Number of images to generate.
`seed`	integer		—	any int	Seed for the RNG.
`aspect_ratio`	string (enum)		`1:1`	`21:9`, `16:9`, `3:2`, `4:3`, `5:4`, `1:1`, `4:5`, `3:4`, `2:3`, `9:16`	Aspect ratio of the output.
`output_format`	string (enum)		`png`	`jpeg`, `png`, `webp`	Format of the generated image.
`resolution`	string (enum)		`1K`	`1K`, `2K`, `4K`	Output resolution (4K costs 2x).

Our wrapper params (not part of the model input schema): out (required — workdir-relative output path), mock (optional — test placeholder), and format (optional — friendly size preset shorts/reels/horizontal, mapped to the model's aspect_ratio via format_mapping: shorts/reels → 9:16, horizontal → 16:9, default 1:1).

Limits — text prompt only (no image input; for instruction-based editing use nano_banana_edit); all outputs carry SynthID watermarking.
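The format mapping and the 4K surcharge for this model can be combined into one planning helper. This is a sketch under stated assumptions: the function is hypothetical, and it treats the 30 cr figure as a flat per-call charge because the text above does not say whether num_images changes it.

```python
FORMAT_TO_ASPECT_RATIO = {"shorts": "9:16", "reels": "9:16", "horizontal": "16:9"}
BASE_CREDITS = 30        # per call, from the Cost row
FOUR_K_MULTIPLIER = 2    # 4K outputs are charged at 2x


def plan_nano_banana_call(fmt=None, resolution="1K") -> dict:
    """Resolve aspect_ratio from a format preset and estimate the call's credits."""
    if resolution not in ("1K", "2K", "4K"):
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if fmt is not None and fmt not in FORMAT_TO_ASPECT_RATIO:
        raise ValueError(f"unknown format preset: {fmt!r}")
    aspect_ratio = FORMAT_TO_ASPECT_RATIO.get(fmt, "1:1")  # default is 1:1
    credits = BASE_CREDITS * (FOUR_K_MULTIPLIER if resolution == "4K" else 1)
    return {"aspect_ratio": aspect_ratio, "resolution": resolution, "credits": credits}


print(plan_nano_banana_call("reels", "4K"))
# {'aspect_ratio': '9:16', 'resolution': '4K', 'credits': 60}
```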

Nano Banana Pro Edit `nano_banana_edit`

Instruction-based image editing built on Google's Gemini 3 Pro Image (Nano Banana Pro): modify, restyle, inpaint, or compose images via natural-language instructions with no masks.

Call it via — the image tool's edit action (MCP) routes to our default editor (seedream_v5_edit), so reach nano_banana_edit by calling the model directly · raw: POST /v1/jobs/nano_banana_edit


Cost	30 cr per call; 4K outputs charged at 2x; web search adds 3 cr
Mode / timeout	sync / 60s (from our YAML)

Parameters — the model's input schema:

Param	Type	Required	Default	Allowed / range	Description
`prompt`	string	✓	—	length 3–50000 chars	The prompt / editing instruction.
`image_urls`	array[string]	✓	—	up to 14 images	URLs of the images to edit / compose.
`num_images`	integer		`1`	1–4	Number of images to generate.
`seed`	integer		—	any int (nullable)	Seed for the RNG.
`aspect_ratio`	string (enum)		`auto`	`auto`, `21:9`, `16:9`, `3:2`, `4:3`, `5:4`, `1:1`, `4:5`, `3:4`, `2:3`, `9:16`	Aspect ratio of the output (`auto` preserves source proportions).
`output_format`	string (enum)		`png`	`jpeg`, `png`, `webp`	Format of the generated image.
`safety_tolerance`	string (enum)		`4`	`1`–`6`	Content-moderation tolerance (1 strictest, 6 least strict).
`sync_mode`	boolean		`false`	—	If true, media is returned as a data URI and is not kept in request history.
`system_prompt`	string		`""`	length ≤ 50000 chars	Optional system instruction steering persona/output style.
`resolution`	string (enum)		`1K`	`1K`, `2K`, `4K`	Output resolution (4K costs 2x).
`limit_generations`	boolean		`false`	—	Experimental: cap each prompting round to 1 image, ignoring count hints in the prompt.
`enable_web_search`	boolean		`false`	—	Allow the model to use live web data (adds 3 cr).

Our wrapper params (not part of the model input schema): out (required — workdir-relative output path), mock (optional — test placeholder), and format (optional — friendly size preset shorts/reels/horizontal, mapped to the model's aspect_ratio field: shorts/reels → 9:16, horizontal → 16:9; with no format given, the default is auto, so the edit preserves the source image's aspect ratio).

Limits — prompt 3–50000 chars; system_prompt ≤ 50000 chars; num_images 1–4; up to 14 input images per composition; character consistency for up to 5 people; resolutions 1K (1024px) / 2K (2048px) / 4K; input images capped at ~89,478,485 pixels (oversized inputs rejected with 422 image_too_large); output formats PNG / JPEG / WebP; all outputs carry SynthID watermarking.
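The cost rules and input limits for this editor can be checked locally before a call. The helpers below are hypothetical. Two points are assumptions rather than documented behaviour: that the 3 cr web-search surcharge is added after the 4K multiplier, and that the required image_urls array must hold at least one URL.

```python
MAX_INPUT_PIXELS = 89_478_485  # input image cap from the Limits note


def edit_credits(resolution: str = "1K", enable_web_search: bool = False) -> int:
    """Estimate credits for one nano_banana_edit call."""
    credits = 30 * (2 if resolution == "4K" else 1)
    # Assumption: the web-search surcharge is flat and not doubled at 4K.
    if enable_web_search:
        credits += 3
    return credits


def check_edit_inputs(prompt: str, image_urls: list) -> list:
    """Return a list of limit violations (empty if the inputs look fine)."""
    problems = []
    if not 3 <= len(prompt) <= 50000:
        problems.append("prompt must be 3-50000 characters")
    if not 1 <= len(image_urls) <= 14:
        problems.append("image_urls must contain 1-14 URLs")
    return problems


def input_fits(width: int, height: int) -> bool:
    """True if an input image is within the pixel cap (else expect a 422)."""
    return width * height <= MAX_INPUT_PIXELS


print(edit_credits("4K", enable_web_search=True))  # 63
print(check_edit_inputs("hi", ["https://example.com/a.png"]))
print(input_fits(10000, 10000))  # False
```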

Seedream v4.5 Edit `seedream_v45_edit`

Edit and compose images at high resolution from natural-language instructions, referencing up to 10 source images in one unified generation/editing architecture.

Call it via — the image MCP tool with action: "edit" is the user-facing edit route, but that action currently maps to seedream_v5_edit, so this v4.5 variant is reached by calling the model directly · raw: POST /v1/jobs/seedream_v45_edit


Cost	8 cr per call
Mode / timeout	sync / 60s (from our YAML)

Parameters — the model's input schema:

Param	Type	Required	Default	Allowed / range	Description
`prompt`	string	✓	—	—	Text prompt used to edit the image.
`image_urls`	array<string>	✓	—	up to 10 URLs	Input images for editing. If more than 10 are sent, only the last 10 are used.
`image_size`	object `{width,height}` or enum string		`{width: 2048, height: 2048}`	enum: `square_hd`, `square`, `portrait_4_3`, `portrait_16_9`, `landscape_4_3`, `landscape_16_9`, `auto_2K`, `auto_4K`; or object with width/height each 1920–4096	Output size. Width and height must each be 1920–4096, and total pixels between 2560×1440 and 4096×4096.
`num_images`	integer		`1`	1–6	Number of separate model generations to run with the prompt.
`max_images`	integer		`1`	1–6	If >1, enables multi-image output: up to `max_images` per generation, `num_images` generations total. Total images (inputs + outputs) must not exceed 15.
`seed`	integer (nullable)		null	—	Random seed to control stochasticity.
`sync_mode`	boolean		`false`	—	If true, media is returned as a data URI and is not stored in request history.
`enable_safety_checker`	boolean		`true`	—	Enables the safety checker.

Our wrapper params (not part of the model input schema): out (required — output filename / workdir-relative path), mock (optional — test placeholder), and format (optional — our preset that we map to the model's image_size field via format_mapping: shorts/reels → 1080×1920, horizontal → 1920×1080).

Limits — up to 10 input reference images (last 10 used if more are provided); max total images (inputs + outputs) = 15; output resolution 1920–4096 px per axis, total pixels from 2560×1440 to 4096×4096 (the 2048×2048 default is ≈4.2 MP); output format PNG via URL or data URI; ~60s inference.
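The size and image-count rules above interact, so a quick local check can catch a bad request early. The helpers are illustrative only. The image budget treats the worst-case output count as num_images × max_images, which is an assumption about how the 15-image total is counted.

```python
MIN_PIXELS = 2560 * 1440   # 3,686,400
MAX_PIXELS = 4096 * 4096   # 16,777,216


def custom_size_ok(width: int, height: int) -> bool:
    """Check a custom {width, height} against the per-axis and total-pixel limits."""
    return (1920 <= width <= 4096 and 1920 <= height <= 4096
            and MIN_PIXELS <= width * height <= MAX_PIXELS)


def image_budget(image_urls: list, num_images: int = 1, max_images: int = 1):
    """Return (inputs actually used, whether inputs + outputs stay within 15)."""
    used_inputs = image_urls[-10:]  # only the last 10 URLs are used
    # Assumption: count the worst case of max_images outputs per generation.
    worst_case_outputs = num_images * max_images
    return used_inputs, len(used_inputs) + worst_case_outputs <= 15


urls = [f"https://example.com/{i}.png" for i in range(12)]
used, ok = image_budget(urls, num_images=2, max_images=3)
print(len(used), ok)               # 10 False
print(custom_size_ok(2048, 2048))  # True
```

Since 1920 × 1920 equals 2560 × 1440 in pixel count, any size that passes the per-axis range also passes the total-pixel range; the second check is kept only to mirror the documented rule.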

Seedream v5 Lite Edit `seedream_v5_edit`

Fast, intelligent image editing from Seedream 5.0 Lite — modify existing images, add/remove elements, composite characters into scenes, and apply style/color transfer, with up to 10 reference images per call.

Call it via — image(action: "edit", image_url, prompt) (MCP tool image, action edit) · raw: POST /v1/jobs/seedream_v5_edit


Cost	7 cr per call
Mode / timeout	sync / 60s (from our YAML)

Parameters — the model's input schema:

Param	Type	Required	Default	Allowed / range	Description
`prompt`	string	✓	—	—	Text prompt describing the edit to apply.
`image_urls`	string[]	✓	—	up to 10 images	URLs of input images to edit. If more than 10 are sent, only the last 10 are used.
`image_size`	ImageSize object \| enum string	—	`auto_2K`	enum: `square_hd`, `square`, `portrait_4_3`, `portrait_16_9`, `landscape_4_3`, `landscape_16_9`, `auto_2K`, `auto_3K`, `auto_4K`; or `{width, height}` (each 1–14142). Total pixels must be 2560×1440…4096×4096, else scaled.	Output image size, as a preset enum or explicit width/height.
`num_images`	integer	—	`1`	1–6	Number of separate generations to run with the prompt.
`max_images`	integer	—	`1`	1–6	If >1, enables multi-image generation: up to `max_images` images per generation, so total output is between `num_images` and `max_images×num_images`.
`sync_mode`	boolean	—	`false`	true / false	If true, media is returned as a data URI and output isn't stored in request history.
`enable_safety_checker`	boolean	—	`true`	true / false	If true, the content safety checker is enabled.

Our wrapper params (not part of the model input schema): out (required — workdir-relative output filename), mock (optional — test placeholder, no real generation), and format (optional — shorts/reels/horizontal preset that we map to the model's image_size field as an explicit {width, height} object: shorts/reels → 1080×1920, horizontal → 1920×1080).

Limits — model limits:

Max reference images: 10 (last 10 used if more are sent).
Max resolution: 3072×3072 (≈9.4 MP); total pixel count supported between 2560×1440 (≈3.7 MP) and 4096×4096 (≈16.8 MP), with sizes outside that range scaled to fit.
Batch: 1–6 generations per call (num_images), up to 6 images each (max_images).
Output format: PNG delivered via HTTPS URL (or data URI when sync_mode=true).
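The relationship between num_images and max_images is easy to misread, so here is a minimal sketch of the output range described in the parameter table. The function name is illustrative.

```python
def output_count_range(num_images: int = 1, max_images: int = 1) -> tuple:
    """Return (fewest, most) images a call may produce.

    Each of the num_images generations yields between 1 and max_images images.
    """
    for name, value in (("num_images", num_images), ("max_images", max_images)):
        if not 1 <= value <= 6:
            raise ValueError(f"{name} must be 1-6")
    return num_images, num_images * max_images


print(output_count_range(2, 3))  # (2, 6)
print(output_count_range())      # (1, 1)
```

With max_images left at 1 the two bounds coincide, so the call returns exactly num_images images.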

Topaz Image Upscale `topaz_upscale_image`

Topaz image enhancer — upscale and enhance images (add detail, face enhancement, sharpening, denoising, compression-artifact removal, and generative detail).

Call it via — image tool, action: "upscale" (MCP) · raw: POST /v1/jobs/topaz_upscale_image


Cost	16 cr per call
Mode / timeout	sync / 120s (from our YAML)

Parameters — the model's input schema:

Param	Type	Required	Default	Allowed / range	Description
`image_url`	string	✓	—	non-empty URL	URL of the image to be upscaled.
`model`	string (enum)		`Standard V2`	`Low Resolution V2`, `Standard V2`, `CGI`, `High Fidelity V2`, `Text Refine`, `Recovery`, `Redefine`, `Recovery V2`, `Standard MAX`, `Wonder`, `Wonder 3`	Model to use for image enhancement.
`upscale_factor`	number		`2`	`1`–`4`	Factor to upscale the image by (2.0 doubles width and height).
`crop_to_fill`	boolean		`false`	true / false	Crop the output to fill the target size.
`output_format`	string (enum)		`jpeg`	`jpeg`, `png`	Output format of the upscaled image.
`subject_detection`	string (enum)		`All`	`All`, `Foreground`, `Background`	Subject detection mode. Applies to standard enhance and Recovery V2 models.
`face_enhancement`	boolean		`true`	true / false	Apply face enhancement. Applies to standard enhance and Recovery V2 models.
`face_enhancement_creativity`	number		`0`	`0`–`1`	Creativity for face enhancement; 0 = none, 1 = max. Ignored if face enhancement is disabled.
`face_enhancement_strength`	number		`0.8`	`0`–`1`	Strength of face enhancement; 0 = none, 1 = max. Ignored if face enhancement is disabled.
`sharpen`	number		—	`0`–`1`	Sharpening level. Applies to Standard V2, Low Resolution V2, CGI, High Fidelity V2, Text Refine, Redefine.
`denoise`	number		—	`0`–`1`	Denoising level. Applies to Standard V2, Low Resolution V2, CGI, High Fidelity V2, Text Refine, Redefine.
`fix_compression`	number		—	`0`–`1`	Compression-artifact removal. Applies to Standard V2, Low Resolution V2, High Fidelity V2, Text Refine.
`strength`	number		—	`0.01`–`1`	Enhancement strength. Applies to Text Refine model only.
`creativity`	integer		—	`1`–`6`	Generative creativity (higher = more hallucinated detail). Applies to Redefine model only.
`texture`	integer		—	`1`–`5`	Texture detail level for generative upscaling. Applies to Redefine model only.
`prompt`	string		—	max 1024 chars	Text prompt to guide generative upscaling. Applies to Redefine model only.
`autoprompt`	boolean		—	true / false	Auto-generate the prompt for generative upscaling. Applies to Redefine model only.
`detail`	number		—	`0`–`1`	Detail recovery level. Applies to Recovery V2 model only.
`enhancement_strength`	string (enum)		—	`low`, `medium`, `high`	Enhancement strength for generative upscaling. Applies to Wonder 3 model only; auto-configured when omitted.

Our wrapper params (not part of the model input schema): out (required — workdir-relative output filename), mock (optional — test placeholder). This model has no format mapping (format_field is empty), so no model size field is derived from format.

Limits — model limits: upscale_factor 1–4; prompt ≤ 1024 chars; accepted input formats jpg, jpeg, png, webp, gif, avif. Catalog cost is a flat 16 cr per call (covers outputs up to ~24 MP).
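Many of the tuning params only apply to certain values of model, which the lookup below tries to capture. It is a hypothetical client-side helper built from the "Applies to" notes in the table. The table does not spell out which models count as "standard enhance" for subject_detection and face_enhancement, so those params are treated as unrestricted here.

```python
CLASSIC = {"Standard V2", "Low Resolution V2", "CGI", "High Fidelity V2",
           "Text Refine", "Redefine"}

# Param name -> models it applies to, per the parameter table.
APPLIES_TO = {
    "sharpen": CLASSIC,
    "denoise": CLASSIC,
    "fix_compression": {"Standard V2", "Low Resolution V2",
                        "High Fidelity V2", "Text Refine"},
    "strength": {"Text Refine"},
    "creativity": {"Redefine"},
    "texture": {"Redefine"},
    "prompt": {"Redefine"},
    "autoprompt": {"Redefine"},
    "detail": {"Recovery V2"},
    "enhancement_strength": {"Wonder 3"},
}


def split_params(model: str, params: dict) -> tuple:
    """Split params into (applicable dict, names that do not apply to model)."""
    kept, dropped = {}, []
    for name, value in params.items():
        allowed = APPLIES_TO.get(name)
        if allowed is None or model in allowed:
            kept[name] = value      # unrestricted param, or valid for this model
        else:
            dropped.append(name)    # documented for other models only
    return kept, dropped


kept, dropped = split_params(
    "Redefine",
    {"image_url": "https://example.com/in.png", "creativity": 3, "detail": 0.5},
)
print(kept)     # {'image_url': 'https://example.com/in.png', 'creativity': 3}
print(dropped)  # ['detail']
```

Whether the service ignores or rejects a param that does not apply is not stated above, so dropping such params locally is simply the cautious option.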
