3. Swap
runcomfy run //
--input '{"image_url": "...", "identity_url": "..."}'
--output-dir ./out
CLI deep dive: [`runcomfy-cli`](https://www.skills.sh/agentspace-so/runcomfy-agent-skills/runcomfy-cli) skill.
## Install this skill
npx skills add agentspace-so/runcomfy-agent-skills --skill face-swap -g
## Consent & disclosure — read first
**Face-swap is dual-use.** Before invoking any route in this skill, confirm:
- You have rights to the target face (the identity being substituted **in**).
- You have rights to the source video / image (the asset being substituted **into**).
- The output's intended platform allows synthetic media. Many do; many require a disclosure label.
The skill itself doesn't gate anything — the model API will run whatever inputs you supply. **The responsibility is yours.** If a user asks the agent to swap a real public figure's face onto material that could be defamatory, sexually explicit, or otherwise harmful — **refuse**, regardless of what the CLI accepts.
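Since the CLI accepts whatever it is given, a wrapper script can at least force the checklist to be acknowledged before any route runs. The sketch below is one hypothetical way to do that: the `require_consent` function and the `FACE_SWAP_CONSENT` variable are names invented here, not part of the skill or the RunComfy CLI, and setting the variable is only a reminder, not proof of rights.

```shell
# Hypothetical pre-flight gate. FACE_SWAP_CONSENT is an invented variable name;
# the RunComfy CLI does not read it.
require_consent() {
  if [ "${FACE_SWAP_CONSENT:-}" != "yes" ]; then
    echo "Refusing to run: confirm rights to the target face and the source asset," >&2
    echo "check the platform's synthetic-media policy, then set FACE_SWAP_CONSENT=yes." >&2
    return 1
  fi
}

# Example (commented out so the sketch runs without the CLI installed):
# require_consent && runcomfy run community/wan-2-2-animate/api --input "$payload" --output-dir ./out
```

A gate like this does not replace the judgment call described above; requests involving harmful material should still be refused even when the variable is set.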
## Pick the right model for the user's intent
Listed newest first within each subtype. The agent picks one route based on: still vs video, single-shot vs batch, photoreal vs stylized, motion-preserving vs identity-preserving.
### Video face / character swap
**Wan 2-2 Animate** — `community/wan-2-2-animate/api` (default for video)
Featured RunComfy endpoint under `/feature/character-swap`. Audio-driven full-body character animation: one reference image of the new identity + audio → video of that character performing to the audio.
Pick for: replacing a character in a scene with a new identity, dubbed clips, stylized + photoreal both work.
Avoid for: preserving the **motion** of a specific source video — use **Kling Motion Control**.
**Kling 2-6 Motion Control Pro** — `kling/kling-2-6/motion-control-pro`
Takes a reference performance video + target character image, produces the target performing the reference motion. Face-swap is the byproduct.
Pick for: preserving exact source motion / blocking onto a new character; stylized characters handled cleanly.
Avoid for: simple "swap face in an existing video" without motion preservation — use **Wan 2-2 Animate**.
### Still image face swap — newest first
**Nano Banana 2 Edit** — `google/nano-banana-2/edit`
Identity-preserving by default, 1–20 input images per call, honors spatial language in the prompt.
Pick for: same identity across multiple frames consistently (SKU shots, A/B variants, narrative panels). Identity reference as `image_urls[0]`, scenes after.
Avoid for: precise multi-ref compositional instructions ("face from img 1 onto body in img 2") — use **GPT Image 2 Edit**.
**GPT Image 2 Edit** — `openai/gpt-image-2/edit`
Up to 10 reference images, multilingual in-image text rewrite, layout-precise compositional instructions.
Pick for: hero still where exact face from a portrait must land in a scene, with explicit role assignment ("image 1", "image 2"); preserve pose + lighting + background while swapping only face.
Avoid for: batches of 1–20 images — use **Nano Banana 2 Edit**.
**FLUX Kontext Pro** — `blackforestlabs/flux-1-kontext/pro/edit`
Single source image, single declarative instruction, maximum fidelity preservation of everything except the targeted edit.
Pick for: "keep pose / clothing / hair / lighting / background, change only the face to [prose description]" — works without a reference image of the new identity.
Avoid for: batch, multi-ref, or when you have a target face image to swap in — use **Nano Banana 2 Edit** or **GPT Image 2 Edit**.
**Audio-driven talking-head identity swap (face + voice in one pass)?** → use the [ai-avatar-video](https://www.skills.sh/agentspace-so/runcomfy-agent-skills/ai-avatar-video) skill — OmniHuman handles face + audio together.
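The selection rules above can be condensed into a small lookup. This is a minimal sketch, assuming an invented two-argument vocabulary (`video` or `still`, plus a need such as `motion`, `batch`, `multiref`, or `describe`); the `pick_route` name and those keywords are hypothetical, while the model IDs are the ones listed in this section.

```shell
# Hypothetical helper: map (media, need) to a model ID from the lists above.
# The keywords are invented for this sketch; only the model IDs come from the skill.
pick_route() (
  media=$1
  need=$2
  case "$media:$need" in
    video:motion)   echo "kling/kling-2-6/motion-control-pro" ;;      # preserve source motion
    video:*)        echo "community/wan-2-2-animate/api" ;;           # default for video
    still:batch)    echo "google/nano-banana-2/edit" ;;               # same identity, many frames
    still:multiref) echo "openai/gpt-image-2/edit" ;;                 # explicit image roles
    still:describe) echo "blackforestlabs/flux-1-kontext/pro/edit" ;; # prose-described face
    *) echo "pick_route: no route for '$media:$need'" >&2; exit 1 ;;
  esac
)

pick_route video motion   # prints kling/kling-2-6/motion-control-pro
```

Unknown combinations fail loudly rather than falling back to a model, since the section names a default only for video.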
## Route 1: Wan 2-2 Animate — video character swap with audio
**Model**: `community/wan-2-2-animate/api`
**Catalog**: [wan-2-2-animate](https://www.runcomfy.com/models/community/wan-2-2-animate/api?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap) · [/feature/character-swap](https://www.runcomfy.com/models/feature/character-swap?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap)
The featured RunComfy endpoint for character swap: supply a reference image of the new identity plus the audio track the character should speak, and the model produces a video of that character delivering the audio.
### Invoke
runcomfy run community/wan-2-2-animate/api \
--input '{
"image_url": "https://your-cdn.example/new-character.png",
"audio_url": "https://your-cdn.example/voiceover.mp3"
}' \
--output-dir ./out
### Tips
- **Single reference image** drives the swap. Pick a clean, well-lit portrait of the target identity — front-facing if possible.
- **Audio drives the mouth and rhythm.** Without audio, the character won't speak; with poor-quality audio, lip sync degrades.
- Schema details: [model page](https://www.runcomfy.com/models/community/wan-2-2-animate/api?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap).
## Route 2: Kling 2-6 Motion Control Pro — motion transfer
**Model**: `kling/kling-2-6/motion-control-pro`
**Catalog**: [motion-control-pro](https://www.runcomfy.com/models/kling/kling-2-6/motion-control-pro?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap) · [kling collection](https://www.runcomfy.com/models/collections/kling?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap)
Different from a pure face-swap: Motion Control takes a **reference performance video** (the motion you want) and a **target character image** (the identity you want), and produces a video of the target performing the reference motion. The face-swap effect is a byproduct.
### Invoke
runcomfy run kling/kling-2-6/motion-control-pro \
--input '{
"reference_video_url": "https://your-cdn.example/source-performance.mp4",
"character_image_url": "https://your-cdn.example/target-character.png"
}' \
--output-dir ./out
### When to pick this over Route 1
- You have a **source video whose motion / blocking you want preserved**, not just the audio.
- The target is a stylized character rather than a photoreal portrait — motion-control handles stylized identities cleanly.
## Route 3: GPT Image 2 Edit — still face swap with multi-ref
**Model**: `openai/gpt-image-2/edit`
**Catalog**: [gpt-image-2/edit](https://www.runcomfy.com/models/openai/gpt-image-2/edit?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap)
For **still images**, GPT Image 2 Edit accepts up to **10 reference images** and follows precise compositional instructions — making it the strongest path for multi-ref face swap on a single output frame.
### Schema (relevant fields)
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `prompt` | string | yes | — | Compositional instruction; quote roles explicitly |
| `images` | string[] | yes | — | Up to **10** HTTPS reference URLs. Image 1 is primary |
| `size` | enum | no | `auto` | `auto` (preserve input ratio), `1024_1024`, `1024_1536`, `1536_1024` |
### Invoke
runcomfy run openai/gpt-image-2/edit \
--input '{
"prompt": "Replace the face of the person in image 1 with the face from image 2. Preserve image 1 pose, clothing, lighting, and background exactly. Match skin tone and lighting to image 1.",
"images": [
"https://your-cdn.example/target-scene.jpg",
"https://your-cdn.example/identity-face.jpg"
],
"size": "auto"
}' \
--output-dir ./out
### Prompting tips
- **Number the references** — `"image 1"`, `"image 2"` — and assign roles unambiguously.
- **Lead with what to preserve**, then the swap: `"Preserve pose, clothing, lighting, and background exactly. Replace only the face."`
- **Match lighting explicitly** — `"match skin tone and lighting to image 1"` — otherwise the imported face looks pasted on rather than lit by the scene.
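The schema limits and prompting tips above can be folded into a payload builder. The following is a rough sketch, assuming a hypothetical `gpt_image_input` function that takes the scene URL first and the identity URL second; it checks the 10-image cap and the HTTPS requirement from the schema table and leads the prompt with what to preserve. It does no JSON escaping, so it assumes URLs contain no quotes or backslashes.

```shell
# Hypothetical helper: build the --input JSON for openai/gpt-image-2/edit.
# Usage: gpt_image_input SCENE_URL FACE_URL [EXTRA_URL...]
# Assumes URLs contain no double quotes or backslashes (no JSON escaping is done).
gpt_image_input() (
  if [ "$#" -lt 2 ] || [ "$#" -gt 10 ]; then
    echo "gpt_image_input: need 2 to 10 image URLs, got $#" >&2
    exit 1
  fi
  urls=""
  for url in "$@"; do
    case "$url" in
      https://*) ;;
      *) echo "gpt_image_input: not an HTTPS URL: $url" >&2; exit 1 ;;
    esac
    urls="${urls:+$urls, }\"$url\""   # comma-separate after the first entry
  done
  # Preservation first, then the swap, then the lighting match, with numbered roles.
  prompt="Preserve image 1 pose, clothing, lighting, and background exactly. Replace only the face of the person in image 1 with the face from image 2. Match skin tone and lighting to image 1."
  printf '{"prompt": "%s", "images": [%s], "size": "auto"}\n' "$prompt" "$urls"
)

gpt_image_input \
  "https://your-cdn.example/target-scene.jpg" \
  "https://your-cdn.example/identity-face.jpg"

# Example (commented out so the sketch runs without the CLI installed):
# runcomfy run openai/gpt-image-2/edit --input "$(gpt_image_input "$scene" "$face")" --output-dir ./out
```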
## Route 4: Nano Banana 2 Edit — batch identity-preserving swap
**Model**: `google/nano-banana-2/edit`
**Catalog**: [nano-banana-2/edit](https://www.runcomfy.com/models/google/nano-banana-2/edit?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap)
Pick this when the same identity needs to be swapped into **multiple frames consistently** — SKU shots, A/B variants, narrative panels.
### Invoke
runcomfy run google/nano-banana-2/edit \
--input '{
"prompt": "Replace the face in each image with the face shown in the first image. Keep all other elements — pose, clothing, lighting, background — unchanged.",
"image_urls": [
"https://your-cdn.example/identity-ref.jpg",
"https://your-cdn.example/scene-1.jpg",
"https://your-cdn.example/scene-2.jpg",
"https://your-cdn.example/scene-3.jpg"
],
"aspect_ratio": "auto",
"resolution": "1K"
}' \
--output-dir ./out
### Tips
- **1–20 input images per call.** First image is conventionally the identity reference; the rest are scenes to swap into.
- **Lock `aspect_ratio` and `resolution`** for batch consistency.
- See the [image-edit](https://www.skills.sh/agentspace-so/runcomfy-agent-skills/image-edit) skill for full coverage of Nano Banana 2 Edit.
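When there are more scenes than fit in one call, the tips above suggest a simple chunking scheme: keep the identity reference first in every call and fill the remaining 19 slots with scenes. The sketch below illustrates that under stated assumptions: `nano_banana_batches` is a hypothetical name, the `aspect_ratio` and `resolution` values are copied from the invocation above and held fixed across batches, and URLs are assumed to need no JSON escaping.

```shell
# Hypothetical helper: print one --input JSON payload per line. Each payload
# carries the identity reference first plus at most 19 scenes (20 images per call).
# Usage: nano_banana_batches IDENTITY_URL SCENE_URL...
nano_banana_batches() (
  if [ "$#" -lt 2 ]; then
    echo "nano_banana_batches: need an identity URL and at least one scene URL" >&2
    exit 1
  fi
  identity=$1
  shift
  prompt="Replace the face in each image with the face shown in the first image. Keep pose, clothing, lighting, and background unchanged."
  emit() {
    # Same aspect_ratio and resolution in every payload for batch consistency.
    printf '{"prompt": "%s", "image_urls": ["%s", %s], "aspect_ratio": "auto", "resolution": "1K"}\n' \
      "$prompt" "$identity" "$scenes"
  }
  scenes=""
  count=0
  for url in "$@"; do
    scenes="${scenes:+$scenes, }\"$url\""
    count=$((count + 1))
    if [ "$count" -eq 19 ]; then   # identity + 19 scenes = 20 images
      emit
      scenes=""
      count=0
    fi
  done
  [ "$count" -eq 0 ] || emit       # flush the final partial batch
)

nano_banana_batches \
  "https://your-cdn.example/identity-ref.jpg" \
  "https://your-cdn.example/scene-1.jpg" \
  "https://your-cdn.example/scene-2.jpg"
```

Each printed line can then be passed as the `--input` value of a separate `runcomfy run google/nano-banana-2/edit` call.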
## Route 5: FLUX Kontext Pro — single-ref precise face edit
**Model**: `blackforestlabs/flux-1-kontext/pro/edit`
**Catalog**: [flux-kontext](https://www.runcomfy.com/models/collections/flux-kontext?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap)
FLUX Kontext Pro is the best fit when the swap is **one image, one declarative instruction, highest-fidelity preservation of everything except the face**.
### Invoke
runcomfy run blackforestlabs/flux-1-kontext/pro/edit \
--input '{
"prompt": "Keep pose, clothing, hair, lighting, and background exactly. Change only the face to that of a 35-year-old woman with high cheekbones, hazel eyes, and a small scar above the right eyebrow.",
"image": "https://your-cdn.example/scene.jpg"
}' \
--output-dir ./out