subtitle-generation

Generate subtitles and captions for videos using each::sense AI. Create auto-generated subtitles, multi-language captions, animated TikTok-style text, SRT/VTT…

INSTALLATION
npx skills add https://github.com/eachlabs/skills --skill subtitle-generation
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

$27

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Generate subtitles for this video with accurate timestamps"}],

    "model": "eachsense/beta",

    "stream": true,

    "mode": "max",

    "file_urls": ["https://example.com/my-video.mp4"]

  }'

Subtitle Formats & Outputs

Format

Extension

Use Case

SRT

.srt

Universal, most video players and editors

VTT

.vtt

Web video, HTML5 players, YouTube

Burned-In

.mp4

Social media, no player support needed

JSON

.json

Custom applications, programmatic access

ASS/SSA

.ass

Advanced styling, anime subtitles

Use Case Examples

1. Auto-Generate Subtitles from Video

Automatically transcribe speech from a video file with accurate word-level timestamps.

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Transcribe this video and generate subtitles with accurate timestamps. Output as SRT format. The video contains English speech."}],

    "model": "eachsense/beta",

    "stream": true,

    "mode": "max",

    "file_urls": ["https://example.com/interview-video.mp4"]

  }'

2. Multi-Language Subtitle Generation

Generate subtitles in multiple languages directly from the audio.

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Generate subtitles for this video in English, Spanish, and French. Provide separate SRT files for each language. The original audio is in English."}],

    "model": "eachsense/beta",

    "stream": true,

    "mode": "max",

    "file_urls": ["https://example.com/product-demo.mp4"]

  }'

3. Animated/Styled Captions (TikTok Style)

Create eye-catching animated captions with word-by-word highlighting, popular on TikTok and Instagram Reels.

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Add TikTok-style animated captions to this video. Use bold white text with black outline, word-by-word pop animation, centered at the bottom third of the screen. Make it trendy and engaging."}],

    "model": "eachsense/beta",

    "stream": true,

    "mode": "max",

    "file_urls": ["https://example.com/short-form-content.mp4"]

  }'

4. SRT/VTT Export

Generate clean subtitle files in standard formats for use in video editors or streaming platforms.

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Transcribe this video and export subtitles in both SRT and VTT formats. Ensure proper line breaks (max 42 characters per line, 2 lines max). Include timestamps accurate to milliseconds."}],

    "model": "eachsense/beta",

    "stream": true,

    "mode": "max",

    "file_urls": ["https://example.com/documentary.mp4"]

  }'

5. Speaker Diarization (Identify Speakers)

Generate subtitles that identify and label different speakers in conversations, interviews, or podcasts.

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Generate subtitles for this podcast with speaker diarization. There are 2 speakers - identify them as Speaker 1 and Speaker 2 (or Host and Guest if you can determine roles). Format each line with the speaker label."}],

    "model": "eachsense/beta",

    "stream": true,

    "mode": "max",

    "file_urls": ["https://example.com/podcast-episode.mp4"]

  }'

6. Subtitle Translation

Translate existing subtitles from one language to another while preserving timing.

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Translate these English subtitles to Japanese. Preserve the original timing and format. Ensure natural Japanese phrasing rather than literal translation."}],

    "model": "eachsense/beta",

    "stream": true,

    "mode": "max",

    "file_urls": ["https://example.com/original-subtitles.srt"]

  }'

7. Burned-In Subtitles

Render subtitles directly into the video file (hardcoded/embedded) so they appear without needing player support.

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Generate subtitles for this video and burn them directly into the video. Use white text with black background box, Arial font, positioned at bottom center. Output a new video file with embedded subtitles."}],

    "model": "eachsense/beta",

    "stream": true,

    "mode": "max",

    "file_urls": ["https://example.com/social-media-clip.mp4"]

  }'

8. Word-by-Word Karaoke Style

Create karaoke-style subtitles with word-by-word highlighting, perfect for music videos and lyric content.

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Create karaoke-style subtitles for this music video. Display lyrics with word-by-word highlighting as they are sung. Use a gradient color change effect (from white to yellow) for the currently sung word. Center the text on screen."}],

    "model": "eachsense/beta",

    "stream": true,

    "mode": "max",

    "file_urls": ["https://example.com/music-video.mp4"]

  }'

9. Subtitle Timing Adjustment

Fine-tune subtitle timing for better synchronization with audio.

# First, upload video and generate initial subtitles

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Generate subtitles for this video"}],

    "model": "eachsense/beta",

    "stream": true,

    "mode": "max",

    "session_id": "subtitle-timing-project",

    "file_urls": ["https://example.com/video-with-delay.mp4"]

  }'

# Then adjust timing in the same session

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "The subtitles are appearing 500 milliseconds too early. Shift all subtitle timings forward by 500ms and regenerate the SRT file."}],

    "model": "eachsense/beta",

    "stream": true,

    "session_id": "subtitle-timing-project"

  }'

10. Batch Subtitle Generation

Generate subtitles for multiple videos in a single workflow.

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Generate English subtitles for all these videos. Output SRT files for each. Use consistent formatting across all videos: max 2 lines, 42 characters per line, minimum 1 second display time per subtitle."}],

    "model": "eachsense/beta",

    "stream": true,

    "mode": "max",

    "file_urls": [

      "https://example.com/episode-01.mp4",

      "https://example.com/episode-02.mp4",

      "https://example.com/episode-03.mp4"

    ]

  }'

Best Practices

Transcription Quality

  • Clear Audio: Best results with clear speech and minimal background noise
  • Language Hint: Specify the source language for better accuracy
  • Speaker Count: Mention number of speakers for better diarization
  • Context: Provide context about the content (technical terms, names) for accuracy

Subtitle Formatting

  • Line Length: Keep lines under 42 characters for readability
  • Duration: Each subtitle should display for 1-7 seconds
  • Lines Per Subtitle: Maximum 2 lines per subtitle block
  • Reading Speed: Target 150-180 words per minute for comfortable reading

Animated Captions

  • Font Choice: Bold, sans-serif fonts work best for short-form content
  • Contrast: Use outlines or shadows for visibility on any background
  • Position: Keep safe zones clear for platform UI elements
  • Animation: Subtle animations are more readable than dramatic effects

Translation

  • Cultural Adaptation: Request localization, not just translation
  • Timing Flexibility: Some languages need more time to read
  • Character Limits: CJK languages often need fewer characters per line

Prompt Tips for Subtitle Generation

When requesting subtitles, include these details in your prompt:

  • Source Language: What language is spoken in the video?
  • Target Format: SRT, VTT, burned-in, or animated?
  • Style Requirements: Font, color, position, animation
  • Speaker Info: Number of speakers, roles if known
  • Special Terms: Technical vocabulary, names, brands
  • Output Languages: Single language or multiple translations

Example Prompt Structure

"Generate [format] subtitles for this video.

Source language: [language].

[Number] speakers: [roles if known].

Style: [font, color, position requirements].

Special terms to recognize: [names, technical words].

Additional requirements: [line length, timing, etc.]"

Mode Selection

Ask your users before generating:

"Do you want fast & cheap, or high quality?"

Mode

Best For

Speed

Quality

max

Final subtitles, professional content, accuracy-critical

Slower

Highest

eco

Quick drafts, review copies, bulk processing

Faster

Good

Multi-Turn Subtitle Refinement

Use session_id to iterate on subtitle generation:

# Initial subtitle generation

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Generate subtitles for this video with speaker identification"}],

    "model": "eachsense/beta",

    "stream": true,

    "session_id": "subtitle-project-001",

    "file_urls": ["https://example.com/interview.mp4"]

  }'

# Refine based on feedback

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Change Speaker 1 label to John and Speaker 2 to Sarah. Also fix the spelling of TensorFlow wherever it appears."}],

    "model": "eachsense/beta",

    "stream": true,

    "session_id": "subtitle-project-001"

  }'

# Add styling and export

curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \

  -H "Content-Type: application/json" \

  -H "X-API-Key: $EACHLABS_API_KEY" \

  -H "Accept: text/event-stream" \

  -d '{

    "messages": [{"role": "user", "content": "Now create a burned-in version with the corrected subtitles. Use yellow text for John and cyan for Sarah."}],

    "model": "eachsense/beta",

    "stream": true,

    "session_id": "subtitle-project-001"

  }'

Language Support

each::sense supports subtitle generation in 50+ languages including:

Language

Code

Notes

English

en

US, UK, AU variants

Spanish

es

Latin American and European

French

fr

France and Canadian

German

de

Japanese

ja

Korean

ko

Chinese

zh

Simplified and Traditional

Arabic

ar

RTL support

Hindi

hi

Portuguese

pt

Brazilian and European

Error Handling

Error

Cause

Solution

Failed to create prediction: HTTP 422

Insufficient balance

Top up at eachlabs.ai

Transcription quality low

Poor audio quality

Provide cleaner audio source

Language detection failed

Mixed languages or unclear speech

Specify source language explicitly

Timeout

Long video or complex processing

Set client timeout to minimum 10 minutes

Client Configuration

Important: Subtitle generation can take significant time for long videos.

  • Minimum timeout: 10 minutes (600 seconds)
  • Recommended: Set timeout based on video length (2-3 minutes per minute of video)
  • Streaming: Use SSE event handling to show progress

Related Skills

  • each-sense - Core API documentation
  • video-generation - Generate videos with built-in captions
  • voice-audio - Audio processing and speech synthesis
  • video-edit - Video editing and post-production
BrowserAct

Let your agent run on any real-world website

Bypass CAPTCHA & anti-bot for free. Start local, scale to cloud.

Explore BrowserAct Skills →

Stop writing automation&scrapers

Install the CLI. Run your first Skill in 30 seconds. Scale when you're ready.

Start free
free · no credit card