face-swap


INSTALLATION
npx skills add https://github.com/runcomfy-com/skills --skill face-swap
Run in your project or agent environment. Adjust flags if your CLI version differs.



3. Swap

runcomfy run //

--input '{"image_url": "...", "identity_url": "..."}'

--output-dir ./out

CLI deep dive: [`runcomfy-cli`](https://www.skills.sh/agentspace-so/runcomfy-agent-skills/runcomfy-cli) skill.

## Install this skill

npx skills add agentspace-so/runcomfy-agent-skills --skill face-swap -g


## Consent & disclosure — read first

**Face-swap is dual-use.** Before invoking any route in this skill, confirm:

- You have rights to the target face (the identity being substituted **in**).

- You have rights to the source video / image (the asset being substituted **into**).

- The output's intended platform allows synthetic media. Many do; many require a disclosure label.

The skill itself doesn't gate anything — the model API will run whatever inputs you supply. **The responsibility is yours.** If a user asks the agent to swap a real public figure's face onto material that could be defamatory, sexually explicit, or otherwise harmful — **refuse**, regardless of what the CLI accepts.

## Pick the right model for the user's intent

Listed newest first within each subtype. The agent picks one route based on: still vs video, single-shot vs batch, photoreal vs stylized, motion-preserving vs identity-preserving.

### Video face / character swap

**Wan 2-2 Animate** — `community/wan-2-2-animate/api` (default for video)

Featured RunComfy endpoint under `/feature/character-swap`. Audio-driven full-body character animation: one reference image of the new identity plus an audio track produces a video in which that character performs the audio.
Pick for: replacing a character in a scene with a new identity, dubbed clips, stylized + photoreal both work.
Avoid for: preserving the **motion** of a specific source video — use **Kling Motion Control**.

**Kling 2-6 Motion Control Pro** — `kling/kling-2-6/motion-control-pro`

Takes a reference performance video + target character image, produces the target performing the reference motion. Face-swap is the byproduct.
Pick for: preserving exact source motion / blocking onto a new character; stylized characters handled cleanly.
Avoid for: simple "swap face in an existing video" without motion preservation — use **Wan 2-2 Animate**.

### Still image face swap — newest first

**Nano Banana 2 Edit** — `google/nano-banana-2/edit`

Identity-preserving by default; accepts 1–20 input images per call and honors spatial language in the prompt.
Pick for: keeping the same identity consistent across multiple frames (SKU shots, A/B variants, narrative panels). Pass the identity reference as `image_urls[0]`, followed by the scenes.
Avoid for: precise multi-reference composition ("face from img 1 onto body in img 2"); use **GPT Image 2 Edit** instead.

**GPT Image 2 Edit** — `openai/gpt-image-2/edit`

Up to 10 reference images, multilingual in-image text rewrite, layout-precise compositional instructions.
Pick for: a hero still where the exact face from a portrait must land in a scene, with explicit role assignment ("image 1", "image 2"), preserving pose, lighting, and background while swapping only the face.
Avoid for: batch swaps across up to 20 images; use **Nano Banana 2 Edit** instead.

**FLUX Kontext Pro** — `blackforestlabs/flux-1-kontext/pro/edit`

Single source image, single declarative instruction, maximum fidelity preservation of everything except the targeted edit.
Pick for: "keep pose / clothing / hair / lighting / background, change only the face to [prose description]" — works without a reference image of the new identity.
Avoid for: batch, multi-ref, or when you have a target face image to swap in — use **Nano Banana 2 Edit** or **GPT Image 2 Edit**.

**Audio-driven talking-head identity swap (face + voice in one pass)?** → use the [ai-avatar-video](https://www.skills.sh/agentspace-so/runcomfy-agent-skills/ai-avatar-video) skill — OmniHuman handles face + audio together.
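
The selection rules above can be sketched as a small shell function. This is a minimal illustration, not part of the skill: the function name `pick_face_swap_model` and its argument vocabulary (`video` / `still`, then `motion`, `audio`, `batch`, `multi-ref`, `describe`) are invented for this sketch. Only the model IDs come from the list above.

```sh
#!/bin/sh
# Hypothetical helper: map (media, need) to one of the model IDs listed above.
# The argument names are assumptions made for this sketch.
pick_face_swap_model() {
  media=$1   # "video" or "still"
  need=$2    # "motion", "audio", "batch", "multi-ref", or "describe"
  case "$media:$need" in
    video:motion)    echo "kling/kling-2-6/motion-control-pro" ;;      # preserve source motion
    video:*)         echo "community/wan-2-2-animate/api" ;;           # default for video
    still:batch)     echo "google/nano-banana-2/edit" ;;               # same identity, many frames
    still:multi-ref) echo "openai/gpt-image-2/edit" ;;                 # explicit image roles
    still:describe)  echo "blackforestlabs/flux-1-kontext/pro/edit" ;; # face described in prose
    *) echo "unknown intent: $media $need" >&2; return 1 ;;
  esac
}
```

For example, `pick_face_swap_model video motion` prints `kling/kling-2-6/motion-control-pro`, and an unrecognized combination returns a non-zero status so the caller can stop before invoking the CLI.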

## Route 1: Wan 2-2 Animate — video character swap with audio

**Model**: `community/wan-2-2-animate/api`
**Catalog**: [wan-2-2-animate](https://www.runcomfy.com/models/community/wan-2-2-animate/api?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap) · [/feature/character-swap](https://www.runcomfy.com/models/feature/character-swap?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap)

The featured RunComfy endpoint for character swap. Supply a reference image of the new identity and the audio track the character should speak; the model produces a video in which that character delivers the audio.

### Invoke

    runcomfy run community/wan-2-2-animate/api \
      --input '{
        "image_url": "https://your-cdn.example/new-character.png",
        "audio_url": "https://your-cdn.example/voiceover.mp3"
      }' \
      --output-dir ./out


### Tips

- **Single reference image** drives the swap. Pick a clean, well-lit portrait of the target identity — front-facing if possible.

- **Audio drives the mouth and rhythm.** Without audio, the character won't speak; with poor audio, sync degrades.

- Schema details: [model page](https://www.runcomfy.com/models/community/wan-2-2-animate/api?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap).
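
Because both inputs are needed for a speaking character, a wrapper can refuse to build the command when either is missing. The sketch below is a dry run that only prints the command; the function name `wan_animate_cmd` is hypothetical, and it assumes the URLs contain no quote characters. The field names, model ID, and flags are the ones shown in the invocation above.

```sh
#!/bin/sh
# Hypothetical dry-run helper: prints the Route 1 command, does not execute it.
# Assumes the URLs contain no single or double quotes.
wan_animate_cmd() {
  image_url=$1
  audio_url=$2
  if [ -z "$image_url" ] || [ -z "$audio_url" ]; then
    echo "need both a reference image URL and an audio URL" >&2
    return 1
  fi
  printf "runcomfy run community/wan-2-2-animate/api --input '{\"image_url\": \"%s\", \"audio_url\": \"%s\"}' --output-dir ./out\n" \
    "$image_url" "$audio_url"
}
```

Printing rather than executing keeps the check separate from the call, so the command can be reviewed before it is run.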

## Route 2: Kling 2-6 Motion Control Pro — motion transfer

**Model**: `kling/kling-2-6/motion-control-pro`
**Catalog**: [motion-control-pro](https://www.runcomfy.com/models/kling/kling-2-6/motion-control-pro?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap) · [kling collection](https://www.runcomfy.com/models/collections/kling?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap)

Different from a pure face-swap: Motion Control takes a **reference performance video** (the motion you want) and a **target character image** (the identity you want), and produces a video of the target performing the reference motion. The face-swap effect is a byproduct.

### Invoke

    runcomfy run kling/kling-2-6/motion-control-pro \
      --input '{
        "reference_video_url": "https://your-cdn.example/source-performance.mp4",
        "character_image_url": "https://your-cdn.example/target-character.png"
      }' \
      --output-dir ./out


### When to pick this over Route 1

- You have a **source video whose motion / blocking you want preserved**, not just the audio.

- The target is a stylized character rather than a photoreal portrait — motion-control handles stylized identities cleanly.

## Route 3: GPT Image 2 Edit — still face swap with multi-ref

**Model**: `openai/gpt-image-2/edit`
**Catalog**: [gpt-image-2/edit](https://www.runcomfy.com/models/openai/gpt-image-2/edit?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap)

For **still images**, GPT Image 2 Edit accepts up to **10 reference images** and follows precise compositional instructions — making it the strongest path for multi-ref face swap on a single output frame.

### Schema (relevant fields)

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `prompt` | string | yes | none | Compositional instruction; quote roles explicitly |
| `images` | string[] | yes | none | Up to **10** HTTPS reference URLs. Image 1 is primary |
| `size` | enum | no | `auto` | `auto` (preserve input ratio), `1024_1024`, `1024_1536`, `1536_1024` |
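
The constraints in this schema can be checked locally before calling the model. The following is a sketch under stated assumptions: `validate_gpt_image_edit` is a hypothetical name, and the lower bound of one image is inferred from the field being required rather than stated in the schema. The size values, the 10-image cap, and the HTTPS requirement come from the schema above.

```sh
#!/bin/sh
# Hypothetical pre-flight check for the fields listed above.
# Usage: validate_gpt_image_edit SIZE URL [URL...]
validate_gpt_image_edit() {
  size=$1
  shift
  case "$size" in
    auto|1024_1024|1024_1536|1536_1024) ;;
    *) echo "invalid size: $size" >&2; return 1 ;;
  esac
  # Assumption: at least one image, since the field is required. Cap is 10.
  if [ "$#" -lt 1 ] || [ "$#" -gt 10 ]; then
    echo "expected 1-10 image URLs, got $#" >&2
    return 1
  fi
  for url in "$@"; do
    case "$url" in
      https://*) ;;
      *) echo "not an HTTPS URL: $url" >&2; return 1 ;;
    esac
  done
  echo ok
}
```

The function prints `ok` and returns zero only when every check passes, so it can guard a `runcomfy run` call with `&&`.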

### Invoke

    runcomfy run openai/gpt-image-2/edit \
      --input '{
        "prompt": "Replace the face of the person in image 1 with the face from image 2. Preserve image 1 pose, clothing, lighting, and background exactly. Match skin tone and lighting to image 1.",
        "images": [
          "https://your-cdn.example/target-scene.jpg",
          "https://your-cdn.example/identity-face.jpg"
        ],
        "size": "auto"
      }' \
      --output-dir ./out


### Prompting tips

- **Number the references** — `"image 1"`, `"image 2"` — and assign roles unambiguously.

- **Lead with what to preserve**, then the swap: `"Preserve pose, clothing, lighting, and background exactly. Replace only the face."`

- **Match lighting explicitly** — `"match skin tone and lighting to image 1"` — otherwise the imported face floats.
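
The three tips above can be combined into a prompt template. This is one possible wording, not a prompt prescribed by the model: the function name `face_swap_prompt` and the exact sentence order are assumptions, built only from the phrases quoted in the tips.

```sh
#!/bin/sh
# Hypothetical prompt builder following the tips above:
# preserve first, then the swap, then explicit lighting match.
# Usage: face_swap_prompt SCENE_INDEX IDENTITY_INDEX
face_swap_prompt() {
  scene=$1      # number of the scene image, e.g. 1
  identity=$2   # number of the identity image, e.g. 2
  if [ "$scene" = "$identity" ]; then
    echo "scene and identity must be different images" >&2
    return 1
  fi
  printf 'Preserve the pose, clothing, lighting, and background of image %s exactly. Replace only the face of the person in image %s with the face from image %s. Match skin tone and lighting to image %s.\n' \
    "$scene" "$scene" "$identity" "$scene"
}
```

Calling `face_swap_prompt 1 2` yields a prompt in which image 1 is the scene and image 2 supplies the face, matching the order of the `images` array in the invocation above.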

## Route 4: Nano Banana 2 Edit — batch identity-preserving swap

**Model**: `google/nano-banana-2/edit`
**Catalog**: [nano-banana-2/edit](https://www.runcomfy.com/models/google/nano-banana-2/edit?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap)

Pick this when the same identity needs to be swapped into **multiple frames consistently** — SKU shots, A/B variants, narrative panels.

### Invoke

    runcomfy run google/nano-banana-2/edit \
      --input '{
        "prompt": "Replace the face in each image with the face shown in the first image. Keep all other elements — pose, clothing, lighting, background — unchanged.",
        "image_urls": [
          "https://your-cdn.example/identity-ref.jpg",
          "https://your-cdn.example/scene-1.jpg",
          "https://your-cdn.example/scene-2.jpg",
          "https://your-cdn.example/scene-3.jpg"
        ],
        "aspect_ratio": "auto",
        "resolution": "1K"
      }' \
      --output-dir ./out


### Tips

- **1–20 input images per call.** First image is conventionally the identity reference; the rest are scenes to swap into.

- **Lock `aspect_ratio` and `resolution`** for batch consistency.

- See [image-edit](https://www.skills.sh/agentspace-so/runcomfy-agent-skills/image-edit) skill for the full Nano Banana Edit treatment.
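
Assembling `image_urls` by hand gets error-prone as the batch grows. The sketch below builds the array with the identity reference first and enforces the 20-image cap. The function name `nano_banana_image_urls` is hypothetical, and requiring at least two URLs (identity plus one scene) is an assumption of this sketch; the API itself is described above as accepting 1–20. It also assumes the URLs contain no quote characters.

```sh
#!/bin/sh
# Hypothetical helper: print a JSON array for image_urls.
# Usage: nano_banana_image_urls IDENTITY_URL SCENE_URL [SCENE_URL...]
nano_banana_image_urls() {
  # Assumption: identity reference plus at least one scene; cap of 20 per call.
  if [ "$#" -lt 2 ] || [ "$#" -gt 20 ]; then
    echo "expected 2-20 URLs (identity first, then scenes), got $#" >&2
    return 1
  fi
  sep=""
  printf '['
  for url in "$@"; do
    printf '%s"%s"' "$sep" "$url"
    sep=", "
  done
  printf ']\n'
}
```

The output can be pasted into the `--input` JSON as the value of `image_urls`, keeping the identity reference at index 0.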

## Route 5: Flux Kontext Pro — single-ref precise face edit

**Model**: `blackforestlabs/flux-1-kontext/pro/edit`
**Catalog**: [flux-kontext](https://www.runcomfy.com/models/collections/flux-kontext?utm_source=skills.sh&utm_medium=skill&utm_campaign=face-swap)

Flux Kontext is best when the swap is **one image, one declarative instruction, highest fidelity preservation of everything except the face**.

### Invoke

    runcomfy run blackforestlabs/flux-1-kontext/pro/edit \
      --input '{
        "prompt": "Keep pose, clothing, hair, lighting, and background exactly. Change only the face to that of a 35-year-old woman with high cheekbones, hazel eyes, and a small scar above the right eyebrow.",
        "image": "https://your-cdn.example/scene.jpg"
      }' \
      --output-dir ./out
