SKILL.md

fal.ai Media Generation

Drift-prone skill. fal.ai model IDs, pricing, inputs, and MCP tool names change quickly. Search or fetch the current model metadata before promising a specific model, parameter, output format, or cost.

Generate images, videos, and audio using fal.ai models via MCP.

When to Activate

Generating images from text prompts

Creating videos from text or images

Generating speech, music, or sound effects

Any other media generation task

User says "generate image", "create video", "text to speech", "make a thumbnail", or similar

MCP Requirement

The fal.ai MCP server must be configured. Add this entry under the mcpServers key in ~/.claude.json:

"fal-ai": {

  "command": "npx",

  "args": ["-y", "fal-ai-mcp-server"],

  "env": { "FAL_KEY": "YOUR_FAL_KEY_HERE" }

}

Get an API key at fal.ai.
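As a minimal sketch of applying the entry above programmatically, the helper below merges it into an existing config dictionary without disturbing other servers. It assumes MCP servers live under a top-level `mcpServers` key; the function name `add_fal_server` is hypothetical, while the package name and `FAL_KEY` variable are copied from the snippet above.

```python
from __future__ import annotations

import copy
import json


def add_fal_server(config: dict, fal_key: str) -> dict:
    """Return a copy of `config` with the fal-ai MCP entry added.

    Assumption: MCP servers are stored under a top-level "mcpServers" key.
    """
    updated = copy.deepcopy(config)  # never mutate the caller's config
    servers = updated.setdefault("mcpServers", {})
    servers["fal-ai"] = {
        "command": "npx",
        "args": ["-y", "fal-ai-mcp-server"],
        "env": {"FAL_KEY": fal_key},
    }
    return updated


if __name__ == "__main__":
    existing = {"mcpServers": {"other": {"command": "node"}}}
    print(json.dumps(add_fal_server(existing, "YOUR_FAL_KEY_HERE"), indent=2))
```

Working on a copy keeps the original config intact, so the result can be inspected before anything is written back to disk.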

MCP Tools

The fal.ai MCP provides these tools:

search — Find available models by keyword

find — Get model details and parameters

generate — Run a model with parameters

result — Check async generation status

status — Check job status

cancel — Cancel a running job

estimate_cost — Estimate generation cost

models — List popular models

upload — Upload files for use as inputs

Image Generation

Nano Banana 2 (Fast)

Best for: quick iterations, drafts, text-to-image, image editing.

generate(

  app_id: "fal-ai/nano-banana-2",

  input_data: {

    "prompt": "a futuristic cityscape at sunset, cyberpunk style",

    "image_size": "landscape_16_9",

    "num_images": 1,

    "seed": 42

  }

)

Nano Banana Pro (High Fidelity)

Best for: production images, realism, typography, detailed prompts.

generate(

  app_id: "fal-ai/nano-banana-pro",

  input_data: {

    "prompt": "professional product photo of wireless headphones on marble surface, studio lighting",

    "image_size": "square",

    "num_images": 1,

    "guidance_scale": 7.5

  }

)

Common Image Parameters

| Param | Type | Options | Notes |
| --- | --- | --- | --- |
| prompt | string | required | Describe what you want |
| image_size | string | square, portrait_4_3, landscape_16_9, portrait_16_9, landscape_4_3 | Aspect ratio |
| num_images | number | 1-4 | How many to generate |
| seed | number | any integer | Reproducibility |
| guidance_scale | number | 1-20 | How closely to follow the prompt (higher = more literal) |
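The parameter ranges above can be checked locally before a `generate` call is made. The following is a sketch only: `validate_image_input` is a hypothetical helper, and the allowed values are copied from this document rather than from live model metadata, so a given model may accept more or fewer options.

```python
from __future__ import annotations

# Values copied from the parameter table in this document (may drift).
ALLOWED_IMAGE_SIZES = {
    "square",
    "portrait_4_3",
    "landscape_16_9",
    "portrait_16_9",
    "landscape_4_3",
}


def validate_image_input(input_data: dict) -> list[str]:
    """Return a list of problems found in an image `input_data` payload."""
    problems = []

    prompt = input_data.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")

    size = input_data.get("image_size")
    if size is not None and size not in ALLOWED_IMAGE_SIZES:
        problems.append(f"image_size {size!r} is not a listed option")

    count = input_data.get("num_images", 1)
    if not isinstance(count, int) or not 1 <= count <= 4:
        problems.append("num_images must be an integer from 1 to 4")

    scale = input_data.get("guidance_scale")
    if scale is not None and (
        not isinstance(scale, (int, float)) or not 1 <= scale <= 20
    ):
        problems.append("guidance_scale must be a number from 1 to 20")

    seed = input_data.get("seed")
    if seed is not None and not isinstance(seed, int):
        problems.append("seed must be an integer")

    return problems
```

An empty list means the payload matches the documented ranges; it does not guarantee that the chosen model accepts every key.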

Image Editing

Use Nano Banana 2 with an input image for inpainting, outpainting, or style transfer:

# First upload the source image

upload(file_path: "/path/to/image.png")

# Then generate with image input

generate(

  app_id: "fal-ai/nano-banana-2",

  input_data: {

    "prompt": "same scene but in watercolor style",

    "image_url": "<uploaded_url>",

    "image_size": "landscape_16_9"

  }

)

Video Generation

Seedance 1.0 Pro (ByteDance)

Best for: text-to-video, image-to-video with high motion quality.

generate(

  app_id: "fal-ai/seedance-1-0-pro",

  input_data: {

    "prompt": "a drone flyover of a mountain lake at golden hour, cinematic",

    "duration": "5s",

    "aspect_ratio": "16:9",

    "seed": 42

  }

)

Kling Video v3 Pro

Best for: text/image-to-video with native audio generation.

generate(

  app_id: "fal-ai/kling-video/v3/pro",

  input_data: {

    "prompt": "ocean waves crashing on a rocky coast, dramatic clouds",

    "duration": "5s",

    "aspect_ratio": "16:9"

  }

)

Veo 3 (Google DeepMind)

Best for: video with generated sound, high visual quality.

generate(

  app_id: "fal-ai/veo-3",

  input_data: {

    "prompt": "a bustling Tokyo street market at night, neon signs, crowd noise",

    "aspect_ratio": "16:9"

  }

)

Image-to-Video

Start from an existing image:

generate(

  app_id: "fal-ai/seedance-1-0-pro",

  input_data: {

    "prompt": "camera slowly zooms out, gentle wind moves the trees",

    "image_url": "<uploaded_image_url>",

    "duration": "5s"

  }

)

Video Parameters

| Param | Type | Options | Notes |
| --- | --- | --- | --- |
| prompt | string | required | Describe the video |
| duration | string | "5s", "10s" | Video length |
| aspect_ratio | string | "16:9", "9:16", "1:1" | Frame ratio |
| seed | number | any integer | Reproducibility |
| image_url | string | URL | Source image for image-to-video |
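A small builder can assemble the video `input_data` from these parameters and reject values outside the listed options. This is an illustrative sketch: `build_video_input` is a hypothetical name, and the allowed durations and aspect ratios are taken from this document, not from any model's current schema.

```python
from __future__ import annotations

# Options copied from the video parameter table in this document (may drift).
ALLOWED_DURATIONS = {"5s", "10s"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def build_video_input(
    prompt: str,
    duration: str = "5s",
    aspect_ratio: str | None = "16:9",
    seed: int | None = None,
    image_url: str | None = None,
) -> dict:
    """Build an `input_data` dict for a video model, omitting unset options."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {sorted(ALLOWED_DURATIONS)}")
    if aspect_ratio is not None and aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(
            f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}"
        )

    input_data = {"prompt": prompt, "duration": duration}
    if aspect_ratio is not None:
        input_data["aspect_ratio"] = aspect_ratio
    if seed is not None:
        input_data["seed"] = seed
    if image_url is not None:
        # Supplying a source image turns the request into image-to-video.
        input_data["image_url"] = image_url
    return input_data
```

Passing `aspect_ratio=None` leaves the key out, which mirrors the image-to-video example in this document where no aspect ratio is given.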

Audio Generation

CSM-1B (Conversational Speech)

Text-to-speech with natural, conversational quality.

generate(

  app_id: "fal-ai/csm-1b",

  input_data: {

    "text": "Hello, welcome to the demo. Let me show you how this works.",

    "speaker_id": 0

  }

)

ThinkSound (Video-to-Audio)

Generate matching audio from video content.

generate(

  app_id: "fal-ai/thinksound",

  input_data: {

    "video_url": "<video_url>",

    "prompt": "ambient forest sounds with birds chirping"

  }

)

ElevenLabs (via API, no MCP)

For professional voice synthesis, use ElevenLabs directly:

import os

import requests

resp = requests.post(

    "https://api.elevenlabs.io/v1/text-to-speech/<voice_id>",

    headers={

        "xi-api-key": os.environ["ELEVENLABS_API_KEY"],

        "Content-Type": "application/json"

    },

    json={

        "text": "Your text here",

        "model_id": "eleven_turbo_v2_5",

        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}

    }

)

# Fail loudly on HTTP errors instead of writing an error body to disk

resp.raise_for_status()

with open("output.mp3", "wb") as f:

    f.write(resp.content)

VideoDB Generative Audio

If VideoDB is configured, use its generative audio:

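# Setup: assumes the videodb package is installed and VIDEO_DB_API_KEY is set

import videodb

conn = videodb.connect()

coll = conn.get_collection()

# Voice generation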

audio = coll.generate_voice(text="Your narration here", voice="alloy")

# Music generation

music = coll.generate_music(prompt="upbeat electronic background music", duration=30)

# Sound effects

sfx = coll.generate_sound_effect(prompt="thunder crack followed by rain")

Cost Estimation

Before generating, check estimated cost:

estimate_cost(

  estimate_type: "unit_price",

  endpoints: {

    "fal-ai/nano-banana-pro": {

      "unit_quantity": 1

    }

  }

)
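When several endpoints are being compared, the `estimate_cost` arguments can be assembled from a simple mapping. The sketch below only builds the request in the shape shown above; `build_cost_request` is a hypothetical helper, and nothing is assumed about the format of the tool's response.

```python
from __future__ import annotations


def build_cost_request(quantities: dict[str, int]) -> dict:
    """Build `estimate_cost` arguments from {endpoint_id: unit_quantity}."""
    if not quantities:
        raise ValueError("at least one endpoint is required")

    endpoints = {}
    for endpoint_id, quantity in quantities.items():
        if quantity < 1:
            raise ValueError(f"unit_quantity for {endpoint_id} must be >= 1")
        endpoints[endpoint_id] = {"unit_quantity": quantity}

    return {"estimate_type": "unit_price", "endpoints": endpoints}


if __name__ == "__main__":
    # Compare a draft model and a final model in one request.
    print(build_cost_request({"fal-ai/nano-banana-2": 4, "fal-ai/nano-banana-pro": 1}))
```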

Model Discovery

Find models for specific tasks:

search(query: "text to video")

find(endpoint_ids: ["fal-ai/seedance-1-0-pro"])

models()
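Because parameters drift, it helps to compare a planned payload against whatever `find` currently reports before calling `generate`. The sketch below assumes the supported and required parameter names have already been extracted from that metadata into plain collections of strings; the actual response format is not documented here, and `check_against_schema` is a hypothetical helper.

```python
from __future__ import annotations

from collections.abc import Iterable


def check_against_schema(
    input_data: dict,
    supported: Iterable[str],
    required: Iterable[str] = (),
) -> dict:
    """Report payload keys the model does not list, and required keys not set.

    `supported` and `required` are parameter names taken from current model
    metadata (assumed to be extracted by the caller).
    """
    provided = set(input_data)
    return {
        "unknown": sorted(provided - set(supported)),
        "missing": sorted(set(required) - provided),
    }


if __name__ == "__main__":
    report = check_against_schema(
        {"prompt": "a red bicycle", "guidance_scale": 7.5},
        supported={"prompt", "seed", "num_images"},
        required={"prompt"},
    )
    print(report)
```

A non-empty `unknown` list is a signal to drop or rename those keys rather than send them and hope the model ignores them.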

Tips

Use seed for reproducible results when iterating on prompts

Start with lower-cost models (Nano Banana 2) for prompt iteration, then switch to Pro for finals

For video, keep prompts descriptive but concise — focus on motion and scene

Image-to-video produces more controlled results than pure text-to-video

Check estimate_cost before running expensive video generations
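The draft-then-final tip can be sketched as two small helpers: one that prepares draft requests with a fixed seed, and one that retargets a chosen draft at the higher-fidelity model. The helper names are hypothetical, and the model IDs are the ones used in this document, which may have changed. A fixed seed keeps drafts comparable within one model; it should not be expected to reproduce the same image on a different model.

```python
from __future__ import annotations

# Model IDs as written in this document; verify them before use.
DRAFT_MODEL = "fal-ai/nano-banana-2"
FINAL_MODEL = "fal-ai/nano-banana-pro"


def draft_calls(
    prompts: list[str], seed: int = 42, image_size: str = "landscape_16_9"
) -> list[dict]:
    """Build one draft `generate` request per prompt, all sharing one seed."""
    return [
        {
            "app_id": DRAFT_MODEL,
            "input_data": {
                "prompt": prompt,
                "image_size": image_size,
                "num_images": 1,
                "seed": seed,
            },
        }
        for prompt in prompts
    ]


def promote(call: dict) -> dict:
    """Return the same request retargeted at the final model."""
    return {"app_id": FINAL_MODEL, "input_data": dict(call["input_data"])}


if __name__ == "__main__":
    drafts = draft_calls(["a lighthouse at dawn", "a lighthouse at dawn, oil painting"])
    print(promote(drafts[1]))
```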

Related Skills

videodb — Video processing, editing, and streaming

video-editing — AI-powered video editing workflows

content-engine — Content creation for social platforms
