alicloud-ai-audio-tts-voice-clone

Voice cloning and text-to-speech synthesis using Alibaba Cloud Qwen TTS VC models.

  • Supports two model variants: standard batch processing (qwen3-tts-vc-2026-01-22) and real-time streaming (qwen3-tts-vc-realtime-2026-01-15).
  • Accepts voice samples as file paths or raw bytes; generates cloned voice IDs for reuse across multiple synthesis requests.
  • Normalized interface handles text input, voice enrollment, and optional streaming output, and returns audio URLs or PCM chunks.
  • Requires the DASHSCOPE_API_KEY environment variable and the dashscope Python SDK; includes a validation script and a local helper for request preparation.

INSTALLATION
npx skills add https://github.com/cinience/alicloud-skills --skill alicloud-ai-audio-tts-voice-clone
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

Category: provider

Model Studio Qwen TTS Voice Clone

Use voice cloning models to replicate timbre from enrollment audio samples.

Critical model names

Use one of these exact model strings:

  • qwen3-tts-vc-2026-01-22
  • qwen3-tts-vc-realtime-2026-01-15
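
Because the model strings must match exactly, it can help to keep them in one place and reject anything else before a request is sent. The following is a minimal sketch, not part of the skill itself: the names `select_model` and `check_model` are hypothetical, and the mapping of the realtime model to streaming requests follows the summary at the top of this page.

```python
# Model strings copied verbatim from the list above.
BATCH_MODEL = "qwen3-tts-vc-2026-01-22"
REALTIME_MODEL = "qwen3-tts-vc-realtime-2026-01-15"
ALLOWED_MODELS = frozenset({BATCH_MODEL, REALTIME_MODEL})


def select_model(stream: bool = False) -> str:
    """Pick the realtime model for streaming requests, else the batch model.

    Hypothetical helper: the skill does not prescribe this function.
    """
    return REALTIME_MODEL if stream else BATCH_MODEL


def check_model(name: str) -> str:
    """Return the name unchanged if it is one of the two exact strings."""
    if name not in ALLOWED_MODELS:
        allowed = ", ".join(sorted(ALLOWED_MODELS))
        raise ValueError(f"unknown model {name!r}; expected one of: {allowed}")
    return name
```

Calling `check_model` on user-supplied input catches near-miss names (for example a missing date suffix) locally instead of at the API.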

Prerequisites

  • Install SDK in a virtual environment:
    python3 -m venv .venv
    . .venv/bin/activate
    python -m pip install dashscope
  • Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
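
The key lookup described above could be sketched as follows. This is an illustration under assumptions: the function name `resolve_api_key` is hypothetical, the environment variable is assumed to take precedence over the file, and `~/.alibabacloud/credentials` is assumed to be an INI-style file with `dashscope_api_key` under some section. Check the skill's own scripts for the actual lookup order.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[Path] = None) -> Optional[str]:
    """Return the DashScope API key, or None if it cannot be found.

    Assumed order: DASHSCOPE_API_KEY first, then the credentials file.
    """
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key

    if credentials_path is None:
        credentials_path = Path.home() / ".alibabacloud" / "credentials"
    if not credentials_path.is_file():
        return None

    # Assumption: INI format. Interpolation is disabled so '%' in a key is safe.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(credentials_path, encoding="utf-8")
    except configparser.Error:
        return None

    # Check every named section, then the DEFAULT section.
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback=None)
        if value:
            return value.strip()
    value = parser.defaults().get("dashscope_api_key")
    return value.strip() if value else None
```

A caller would typically fail fast with a clear message when this returns `None`, rather than letting the SDK raise an authentication error later.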

Normalized interface (tts.voice_clone)

Request

  • text (string, required)
  • voice_sample (string | bytes, required) enrollment sample
  • voice_name (string, optional)
  • stream (bool, optional)

Response

  • audio_url (string) or streaming PCM chunks
  • voice_id (string)
  • request_id (string)
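
One way to make the request and response fields above concrete is a pair of dataclasses with basic validation. This is a sketch only: the class names, `validate`, and `parse_response` are hypothetical, and treating `audio_url` as optional (absent when PCM chunks are streamed instead) is an assumption drawn from the "or streaming PCM chunks" wording.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Union


@dataclass
class VoiceCloneRequest:
    text: str                        # required
    voice_sample: Union[str, bytes]  # required enrollment sample
    voice_name: Optional[str] = None
    stream: bool = False

    def validate(self) -> None:
        """Raise ValueError if a required field is missing or mistyped."""
        if not isinstance(self.text, str) or not self.text.strip():
            raise ValueError("text is required and must be a non-empty string")
        if not isinstance(self.voice_sample, (str, bytes)) or not self.voice_sample:
            raise ValueError("voice_sample is required (string or bytes)")
        if self.voice_name is not None and not isinstance(self.voice_name, str):
            raise ValueError("voice_name must be a string when provided")


@dataclass
class VoiceCloneResponse:
    voice_id: str
    request_id: str
    audio_url: Optional[str] = None  # assumed absent when streaming PCM


def parse_response(payload: Dict[str, Any]) -> VoiceCloneResponse:
    """Check a normalized response dict against the schema listed above."""
    missing = [k for k in ("voice_id", "request_id") if not payload.get(k)]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    return VoiceCloneResponse(
        voice_id=str(payload["voice_id"]),
        request_id=str(payload["request_id"]),
        audio_url=payload.get("audio_url"),
    )
```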

Operational guidance

  • Use clean speech samples with low background noise.
  • Respect consent and policy requirements for cloned voices.
  • Persist the generated voice_id and reuse it for future synthesis requests.
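
Persisting a voice_id can be as simple as a small JSON file keyed by voice name. The sketch below is one possible approach, not something the skill ships: the function names and the JSON layout are hypothetical, and the storage location is left to the caller.

```python
import json
from pathlib import Path
from typing import Dict, Optional


def load_voice_ids(store: Path) -> Dict[str, str]:
    """Read the voice_name -> voice_id mapping, or {} if the file is absent."""
    if not store.is_file():
        return {}
    return json.loads(store.read_text(encoding="utf-8"))


def save_voice_id(store: Path, voice_name: str, voice_id: str) -> None:
    """Add or update one entry, writing via a temp file and then renaming it."""
    ids = load_voice_ids(store)
    ids[voice_name] = voice_id
    store.parent.mkdir(parents=True, exist_ok=True)
    tmp = store.with_suffix(store.suffix + ".tmp")
    tmp.write_text(json.dumps(ids, indent=2, sort_keys=True), encoding="utf-8")
    tmp.replace(store)


def lookup_voice_id(store: Path, voice_name: str) -> Optional[str]:
    """Return a previously saved voice_id, or None if enrollment is needed."""
    return load_voice_ids(store).get(voice_name)
```

Checking `lookup_voice_id` before enrolling avoids re-uploading the same sample for every synthesis request.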

Local helper script

Prepare a normalized request JSON and validate response schema:

.venv/bin/python skills/ai/audio/alicloud-ai-audio-tts-voice-clone/scripts/prepare_voice_clone_request.py \
  --text "Welcome to this voice-clone demo" \
  --voice-sample "https://example.com/voice-sample.wav"

Output location

  • Default output: output/ai-audio-tts-voice-clone/audio/
  • Override base dir with OUTPUT_DIR.
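
The override could be resolved as in the sketch below. It assumes that OUTPUT_DIR replaces only the leading `output` component of the default path; the helper script may interpret the variable differently, and `resolve_audio_dir` is a hypothetical name.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_audio_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the audio output directory, honoring OUTPUT_DIR if set.

    Assumption: OUTPUT_DIR swaps out the 'output' base directory only.
    """
    env = os.environ if env is None else env
    base = Path(env.get("OUTPUT_DIR") or "output")
    return base / "ai-audio-tts-voice-clone" / "audio"
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.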

Validation

mkdir -p output/alicloud-ai-audio-tts-voice-clone
for f in skills/ai/audio/alicloud-ai-audio-tts-voice-clone/scripts/*.py; do
  python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/alicloud-ai-audio-tts-voice-clone/validate.txt

Pass criteria: command exits 0 and output/alicloud-ai-audio-tts-voice-clone/validate.txt is generated.
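
Where a shell is not available, the same check can be approximated in Python with the standard `py_compile` module. This is a sketch rather than the skill's own validator, and `validate_scripts` is a hypothetical name. Unlike the shell loop, which continues past a failed compile unless `set -e` is in effect, this version stops at the first error and does not write the marker file.

```python
import py_compile
from pathlib import Path
from typing import List


def validate_scripts(scripts_dir: Path, out_dir: Path) -> List[Path]:
    """Byte-compile every *.py in scripts_dir, then write validate.txt.

    Raises py_compile.PyCompileError on the first script that fails,
    in which case validate.txt is not written.
    """
    compiled: List[Path] = []
    for script in sorted(Path(scripts_dir).glob("*.py")):
        py_compile.compile(str(script), doraise=True)
        compiled.append(script)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "validate.txt").write_text("py_compile_ok\n", encoding="utf-8")
    return compiled
```

With the paths from the commands above, the call would be `validate_scripts(Path("skills/ai/audio/alicloud-ai-audio-tts-voice-clone/scripts"), Path("output/alicloud-ai-audio-tts-voice-clone"))`.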

Output And Evidence

  • Save artifacts, command outputs, and API response summaries under output/alicloud-ai-audio-tts-voice-clone/.
  • Include key parameters (region/resource id/time range) in evidence files for reproducibility.
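
An evidence file might be written as in the following sketch. The record layout (`recorded_at`, `parameters`, `summary`) and the function name `write_evidence` are hypothetical; the skill only asks that key parameters and response summaries be saved under the output directory.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict


def write_evidence(
    out_dir: Path,
    name: str,
    parameters: Dict[str, Any],
    summary: Dict[str, Any],
) -> Path:
    """Write one JSON evidence record and return its path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    record = {
        # UTC timestamp so runs from different machines are comparable.
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "parameters": parameters,  # e.g. model, region, voice_name
        "summary": summary,        # e.g. request_id, voice_id, audio_url
    }
    path = out_dir / f"{name}.json"
    path.write_text(
        json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return path
```

Keep API keys and raw audio bytes out of `parameters` and `summary`; the record is meant for reproducibility, not as a credential or payload store.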

Workflow

  • Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
  • Run one minimal read-only query first to verify connectivity and permissions.
  • Execute the target operation with explicit parameters and bounded scope.
  • Verify results and save output/evidence files.

References

  • references/sources.md