fal-ai-media

Unified media generation via fal.ai MCP — image, video, and audio. Covers text-to-image (Nano Banana), text/image-to-video (Seedance, Kling, Veo 3),…

INSTALLATION
npx skills add https://github.com/affaan-m/everything-claude-code --skill fal-ai-media
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

fal.ai Media Generation

Drift-prone skill. fal.ai model IDs, pricing, inputs, and MCP tool names

change quickly. Search or fetch the current model metadata before promising a

specific model, parameter, output format, or cost.

Generate images, videos, and audio using fal.ai models via MCP.

When to Activate

  • User wants to generate images from text prompts
  • Creating videos from text or images
  • Generating speech, music, or sound effects
  • Any media generation task
  • User says "generate image", "create video", "text to speech", "make a thumbnail", or similar

MCP Requirement

fal.ai MCP server must be configured. Add to ~/.claude.json:

"fal-ai": {

  "command": "npx",

  "args": ["-y", "fal-ai-mcp-server"],

  "env": { "FAL_KEY": "YOUR_FAL_KEY_HERE" }

}

Get an API key at fal.ai.

MCP Tools

The fal.ai MCP provides these tools:

  • search — Find available models by keyword
  • find — Get model details and parameters
  • generate — Run a model with parameters
  • result — Check async generation status
  • status — Check job status
  • cancel — Cancel a running job
  • estimate_cost — Estimate generation cost
  • models — List popular models
  • upload — Upload files for use as inputs

Image Generation

Nano Banana 2 (Fast)

Best for: quick iterations, drafts, text-to-image, image editing.

generate(

  app_id: "fal-ai/nano-banana-2",

  input_data: {

    "prompt": "a futuristic cityscape at sunset, cyberpunk style",

    "image_size": "landscape_16_9",

    "num_images": 1,

    "seed": 42

  }

)

Nano Banana Pro (High Fidelity)

Best for: production images, realism, typography, detailed prompts.

generate(

  app_id: "fal-ai/nano-banana-pro",

  input_data: {

    "prompt": "professional product photo of wireless headphones on marble surface, studio lighting",

    "image_size": "square",

    "num_images": 1,

    "guidance_scale": 7.5

  }

)

Common Image Parameters

Param

Type

Options

Notes

prompt

string

required

Describe what you want

image_size

string

square, portrait_4_3, landscape_16_9, portrait_16_9, landscape_4_3

Aspect ratio

num_images

number

1-4

How many to generate

seed

number

any integer

Reproducibility

guidance_scale

number

1-20

How closely to follow the prompt (higher = more literal)

Image Editing

Use Nano Banana 2 with an input image for inpainting, outpainting, or style transfer:

# First upload the source image

upload(file_path: "/path/to/image.png")

# Then generate with image input

generate(

  app_id: "fal-ai/nano-banana-2",

  input_data: {

    "prompt": "same scene but in watercolor style",

    "image_url": "<uploaded_url>",

    "image_size": "landscape_16_9"

  }

)

Video Generation

Seedance 1.0 Pro (ByteDance)

Best for: text-to-video, image-to-video with high motion quality.

generate(

  app_id: "fal-ai/seedance-1-0-pro",

  input_data: {

    "prompt": "a drone flyover of a mountain lake at golden hour, cinematic",

    "duration": "5s",

    "aspect_ratio": "16:9",

    "seed": 42

  }

)

Kling Video v3 Pro

Best for: text/image-to-video with native audio generation.

generate(

  app_id: "fal-ai/kling-video/v3/pro",

  input_data: {

    "prompt": "ocean waves crashing on a rocky coast, dramatic clouds",

    "duration": "5s",

    "aspect_ratio": "16:9"

  }

)

Veo 3 (Google DeepMind)

Best for: video with generated sound, high visual quality.

generate(

  app_id: "fal-ai/veo-3",

  input_data: {

    "prompt": "a bustling Tokyo street market at night, neon signs, crowd noise",

    "aspect_ratio": "16:9"

  }

)

Image-to-Video

Start from an existing image:

generate(

  app_id: "fal-ai/seedance-1-0-pro",

  input_data: {

    "prompt": "camera slowly zooms out, gentle wind moves the trees",

    "image_url": "<uploaded_image_url>",

    "duration": "5s"

  }

)

Video Parameters

Param

Type

Options

Notes

prompt

string

required

Describe the video

duration

string

"5s", "10s"

Video length

aspect_ratio

string

"16:9", "9:16", "1:1"

Frame ratio

seed

number

any integer

Reproducibility

image_url

string

URL

Source image for image-to-video

Audio Generation

CSM-1B (Conversational Speech)

Text-to-speech with natural, conversational quality.

generate(

  app_id: "fal-ai/csm-1b",

  input_data: {

    "text": "Hello, welcome to the demo. Let me show you how this works.",

    "speaker_id": 0

  }

)

ThinkSound (Video-to-Audio)

Generate matching audio from video content.

generate(

  app_id: "fal-ai/thinksound",

  input_data: {

    "video_url": "<video_url>",

    "prompt": "ambient forest sounds with birds chirping"

  }

)

ElevenLabs (via API, no MCP)

For professional voice synthesis, use ElevenLabs directly:

import os

import requests

resp = requests.post(

    "https://api.elevenlabs.io/v1/text-to-speech/<voice_id>",

    headers={

        "xi-api-key": os.environ["ELEVENLABS_API_KEY"],

        "Content-Type": "application/json"

    },

    json={

        "text": "Your text here",

        "model_id": "eleven_turbo_v2_5",

        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}

    }

)

with open("output.mp3", "wb") as f:

    f.write(resp.content)

VideoDB Generative Audio

If VideoDB is configured, use its generative audio:

# Voice generation

audio = coll.generate_voice(text="Your narration here", voice="alloy")

# Music generation

music = coll.generate_music(prompt="upbeat electronic background music", duration=30)

# Sound effects

sfx = coll.generate_sound_effect(prompt="thunder crack followed by rain")

Cost Estimation

Before generating, check estimated cost:

estimate_cost(

  estimate_type: "unit_price",

  endpoints: {

    "fal-ai/nano-banana-pro": {

      "unit_quantity": 1

    }

  }

)

Model Discovery

Find models for specific tasks:

search(query: "text to video")

find(endpoint_ids: ["fal-ai/seedance-1-0-pro"])

models()

Tips

  • Use seed for reproducible results when iterating on prompts
  • Start with lower-cost models (Nano Banana 2) for prompt iteration, then switch to Pro for finals
  • For video, keep prompts descriptive but concise — focus on motion and scene
  • Image-to-video produces more controlled results than pure text-to-video
  • Check estimate_cost before running expensive video generations

Related Skills

  • videodb — Video processing, editing, and streaming
  • video-editing — AI-powered video editing workflows
  • content-engine — Content creation for social platforms
BrowserAct

Let your agent run on any real-world website

Bypass CAPTCHA & anti-bot for free. Start local, scale to cloud.

Explore BrowserAct Skills →

Stop writing automation&scrapers

Install the CLI. Run your first Skill in 30 seconds. Scale when you're ready.

Start free
free · no credit card