3. Lipsync
    runcomfy run <model_id> \
      --input '{"video_url": "...", "audio_url": "..."}' \
      --output-dir ./out
CLI deep dive: [`runcomfy-cli`](https://www.skills.sh/agentspace-so/runcomfy-agent-skills/runcomfy-cli) skill.
## Consent
Driving a real person's mouth from a separate audio track is dual-use. Refuse user requests that target real public figures without consent, or that aim at defamatory or sexually explicit synthetic media. The skill itself does not gate inputs — the responsibility rests with the operator.
---
## Pick the right model
Listed newest first within each subtype. The agent picks one route based on input shape (portrait still + audio, source video + audio, or script-only), quality tier, and budget.
### Source video + audio → lip-synced video (mouth-swap on existing footage)
**Sync Labs sync v2 Pro** — `sync/sync/lipsync/v2/pro` *(default for premium)*
> Sync Labs' premium lip-sync — state-of-the-art mouth motion onto an existing video. Preserves the rest of the frame untouched.
> Pick for: hero-quality dubs, lipsync on professionally-shot video, foreign-language dubbing where mouth fidelity matters most.
> Avoid for: cost-sensitive batch jobs — drop to **sync v2**.
**Sync Labs sync v2** — [`sync/sync/lipsync/v2`](https://www.runcomfy.com/models/sync/sync/lipsync/v2?utm_source=skills.sh&utm_medium=skill&utm_campaign=lipsync)
> Standard Sync Labs tier, same workflow as Pro.
> Pick for: scaled / batch lipsync jobs, drafts.
> Avoid for: hero delivery — use **v2 Pro**.
**Kling Lipsync (audio-to-video)** — [`kling/lipsync/audio-to-video`](https://www.runcomfy.com/models/kling/lipsync/audio-to-video?utm_source=skills.sh&utm_medium=skill&utm_campaign=lipsync)
> Kling's lip-sync onto a source video, driven by an audio track.
> Pick for: Kling-pipeline integration; alternative to Sync Labs.
> Avoid for: top-tier mouth fidelity — Sync Labs Pro is the industry benchmark.
**Creatify Lipsync** — [`creatify/lipsync`](https://www.runcomfy.com/models/creatify/lipsync?utm_source=skills.sh&utm_medium=skill&utm_campaign=lipsync)
> Creatify's lipsync endpoint.
> Pick for: Creatify-ecosystem workflows.
> Avoid for: workflows outside the Creatify ecosystem, unless cost or latency favors it.
### Portrait still + audio → talking-head video (avatar-style)
**OmniHuman** — `bytedance/omnihuman/api` *(default for avatar-style)*
> ByteDance's audio-driven full-body avatar. One portrait + one audio → video where the subject speaks / gestures naturally. Listed under RunComfy's `/feature/lip-sync` as the curated default.
> Pick for: UGC voiceover, virtual presenter, dubbed product demo from a single portrait.
> Avoid for: lip-sync onto an existing **video** (no portrait, want to preserve original motion) — use **Sync Labs v2** instead.
**Wan 2-7 with `audio_url`** — `wan-ai/wan-2-7/text-to-video`
> Open-weights t2v with `audio_url` field — prompt describes the scene, audio drives the mouth.
> Pick for: full scene control (not just a portrait) with a specific voiceover MP3 + open-weights pipeline.
> Avoid for: simplest "portrait talks" — use **OmniHuman**.
### Generate-and-sync from a script (no audio file available)
**Kling Lipsync (text-to-video)** — [`kling/lipsync/text-to-video`](https://www.runcomfy.com/models/kling/lipsync/text-to-video?utm_source=skills.sh&utm_medium=skill&utm_campaign=lipsync)
> Generates speech audio in-pass from a script and syncs it to the resulting video.
> Pick for: "write a script → get a video with synced speech", no audio file needed.
> Avoid for: precise lip-sync to a specific MP3 (audio is regenerated each call, not locked).
**HappyHorse 1.0** — `happyhorse/happyhorse-1-0/text-to-video` (also `/image-to-video`)
> Arena #1 t2v / i2v with in-pass audio generated from prompt. Quote the spoken line inside the prompt with `says clearly: "…"`.
> Pick for: written script, in-pass audio with strong overall quality, social/UGC clips.
> Avoid for: locking mouth to a pre-recorded voiceover.
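The routing above can be sketched as a small shell function. This is a minimal sketch, not part of the skill or the `runcomfy` CLI: the function name `pick_lipsync_model`, the shape labels (`video+audio`, `portrait+audio`, `script`), the tier labels (`premium`, `standard`), and the premium default are all assumptions made for illustration. It returns only the first-listed model for each subtype and ignores the alternatives (Kling audio-to-video, Creatify, Wan, HappyHorse).

```shell
# Map an input shape and a quality tier to a model id from the lists above.
# Shape and tier labels are hypothetical names chosen for this sketch.
pick_lipsync_model() (
  shape=$1
  tier=${2:-premium}   # assumed default; pass "standard" for batch or draft jobs
  case "$shape" in
    video+audio)
      # Mouth-swap on existing footage: Sync Labs, Pro or standard tier.
      if [ "$tier" = "premium" ]; then
        echo "sync/sync/lipsync/v2/pro"
      else
        echo "sync/sync/lipsync/v2"
      fi
      ;;
    portrait+audio)
      # Talking head from a single still.
      echo "bytedance/omnihuman/api"
      ;;
    script)
      # No audio file: speech is generated in-pass.
      echo "kling/lipsync/text-to-video"
      ;;
    *)
      echo "unknown input shape: $shape" >&2
      exit 1
      ;;
  esac
)
```

For example, `pick_lipsync_model video+audio standard` prints `sync/sync/lipsync/v2`, which can then be passed as the model id to `runcomfy run`.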
---
## Route 1: Sync Labs sync v2 / Pro — default for mouth-swap
**Model**: `sync/sync/lipsync/v2/pro` (or `sync/sync/lipsync/v2`)
**Catalog**: [sync v2 Pro](https://www.runcomfy.com/models/sync/sync/lipsync/v2/pro?utm_source=skills.sh&utm_medium=skill&utm_campaign=lipsync) · [sync v2](https://www.runcomfy.com/models/sync/sync/lipsync/v2?utm_source=skills.sh&utm_medium=skill&utm_campaign=lipsync)
### Invoke
    runcomfy run sync/sync/lipsync/v2/pro \
      --input '{
        "video_url": "https://your-cdn.example/source-video.mp4",
        "audio_url": "https://your-cdn.example/voiceover.mp3"
      }' \
      --output-dir ./out
### Tips
- **Source video provides everything except the mouth** — camera, lighting, background, body pose all preserved.
- **Audio quality drives mouth quality.** Clean voiceover (no music bed) → cleaner sync. Isolate the voice stem if needed.
- **Match audio length to video length.** A significant audio/video duration mismatch leads to drift; trim the audio or extend the video first.
- Schema details on the [model page](https://www.runcomfy.com/models/sync/sync/lipsync/v2/pro?utm_source=skills.sh&utm_medium=skill&utm_campaign=lipsync).
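The length-matching tip can be checked before submitting a job. The sketch below assumes local copies of both files and that `ffprobe` (shipped with FFmpeg, not with the `runcomfy` CLI) is installed. The helper names and the 0.5 s default tolerance are assumptions for illustration; the source does not state what counts as a "significant" mismatch.

```shell
# Print the duration of a local media file in seconds.
# Assumes ffprobe (part of FFmpeg) is on PATH.
media_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Succeed (exit 0) when two durations in seconds differ by at most a tolerance.
# $1, $2: durations; $3: tolerance in seconds (assumed default 0.5).
durations_close() {
  awk -v a="$1" -v b="$2" -v tol="${3:-0.5}" \
    'BEGIN { d = a - b; if (d < 0) d = -d; exit !(d <= tol) }'
}
```

A possible pre-flight check: `durations_close "$(media_duration source-video.mp4)" "$(media_duration voiceover.mp3)" || echo "trim audio or extend video first"`.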
## Route 2: OmniHuman — default for avatar from still
**Model**: `bytedance/omnihuman/api`
**Catalog**: [omnihuman](https://www.runcomfy.com/models/bytedance/omnihuman/api?utm_source=skills.sh&utm_medium=skill&utm_campaign=lipsync)
### Invoke
    runcomfy run bytedance/omnihuman/api \
      --input '{
        "image_url": "https://your-cdn.example/portrait.jpg",
        "audio_url": "https://your-cdn.example/voiceover.mp3"
      }' \
      --output-dir ./out
### Tips
- **Portrait framing works best** — head-and-shoulders or upper body.
- **No prompt** — the model derives everything from image + audio. Don't fight that.
- See the [ai-avatar-video](https://www.skills.sh/agentspace-so/runcomfy-agent-skills/ai-avatar-video) skill for the full avatar treatment.
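Routes 1 and 2 both take an `audio_url` and differ only in the visual input key (`video_url` for source footage, `image_url` for a portrait). A minimal sketch of a helper that builds the `--input` JSON for either case follows. `lipsync_input` is a hypothetical name, not a `runcomfy` feature, and the sketch covers only the two fields shown in the invocations above; check each model page for the full schema.

```shell
# Build the --input JSON for a lipsync route.
# $1: visual input key, "video_url" (Route 1) or "image_url" (Route 2)
# $2: URL of the source video or portrait
# $3: URL of the audio track
# lipsync_input is a hypothetical helper, not part of the runcomfy CLI.
lipsync_input() (
  key=$1
  case "$key" in
    video_url|image_url) ;;
    *) echo "unsupported key: $key" >&2; exit 1 ;;
  esac
  # Escape backslashes and double quotes so the URLs stay valid JSON strings.
  esc() { printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'; }
  printf '{"%s": "%s", "audio_url": "%s"}\n' "$key" "$(esc "$2")" "$(esc "$3")"
)
```

The output can be passed straight to the CLI, for example `runcomfy run bytedance/omnihuman/api --input "$(lipsync_input image_url https://your-cdn.example/portrait.jpg https://your-cdn.example/voiceover.mp3)" --output-dir ./out`.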
## Route 3: Kling Lipsync — Kling-ecosystem mouth sync
**Model**: `kling/lipsync/audio-to-video` (existing video + audio) or `kling/lipsync/text-to-video` (script-only)
**Catalog**: [Kling lipsync a2v](https://www.runcomfy.com/models/kling/lipsync/audio-to-video?utm_source=skills.sh&utm_medium=skill&utm_campaign=lipsync) · [Kling lipsync t2v](https://www.runcomfy.com/models/kling/lipsync/text-to-video?utm_source=skills.sh&utm_medium=skill&utm_campaign=lipsync)
### Invoke (audio-to-video variant)
    runcomfy run kling/lipsync/audio-to-video \
      --input '{
        "video_url": "https://your-cdn.example/source-video.mp4",
        "audio_url": "https://your-cdn.example/voiceover.mp3"
      }' \
      --output-dir ./out