ocr

OCR skill for extracting text from images and PDFs. Use when you need to read text from screenshots, photos, scanned documents, or any image file. Supports…

INSTALLATION
npx skills add https://github.com/mr-shaper/opencode-skills-paddle-ocr --skill ocr
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

OCR Skill

Usage

To extract text from an image or PDF, run:

python3 "/Users/mrshaper/Library/Application Support/com.differentai.openwork/workspaces/starter/.opencode/skills/paddle-ocr/scripts/ocr.py" "/path/to/image.png"

Options

Option             Description
--prompt "text"    Custom prompt (e.g., "Extract table as markdown")
--fast             Use the faster PaddleOCR engine instead of DeepSeek-OCR
--json             Output in JSON format
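
If you call the script from other Python code, the options above can be assembled programmatically. The following is a minimal sketch, not part of the skill itself: the helper name `build_ocr_command` and the `OCR_SCRIPT` constant are hypothetical, and the relative script path is assumed from the Examples section. The source does not say whether the flags may be combined, so treat any combination as untested.

```python
import sys
from pathlib import Path
from typing import List, Optional

# Assumption: relative location of the script, as used in the Examples section.
OCR_SCRIPT = Path("scripts/ocr.py")


def build_ocr_command(
    input_path: str,
    prompt: Optional[str] = None,
    fast: bool = False,
    as_json: bool = False,
    script: Path = OCR_SCRIPT,
) -> List[str]:
    """Build the argument list for one ocr.py call (hypothetical helper)."""
    # Start with the interpreter, the script, and the file to read.
    cmd = [sys.executable, str(script), str(input_path)]
    if prompt is not None:
        # --prompt takes the custom prompt text as its value.
        cmd += ["--prompt", prompt]
    if fast:
        # --fast switches to PaddleOCR instead of DeepSeek-OCR.
        cmd.append("--fast")
    if as_json:
        # --json requests JSON output.
        cmd.append("--json")
    return cmd
```

Returning a list rather than a single string lets you pass the result straight to `subprocess.run` without shell quoting, which matters when the prompt or the file path contains spaces.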

Examples

# Basic OCR

python3 scripts/ocr.py image.png

# Extract table as markdown

python3 scripts/ocr.py table.png --prompt "Extract this table as markdown"

# Fast mode

python3 scripts/ocr.py image.png --fast

# PDF OCR

python3 scripts/ocr.py document.pdf
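
The same commands can be driven from Python. The sketch below is one possible wrapper under two assumptions that the source does not confirm: that `ocr.py` writes its result to standard output, and that it exits with a non-zero status on failure. The function name `run_ocr` is hypothetical.

```python
import subprocess
import sys
from typing import Sequence


def run_ocr(
    input_path: str,
    extra_args: Sequence[str] = (),
    script: str = "scripts/ocr.py",
) -> str:
    """Run the OCR script on one file and return whatever it prints.

    Assumption: the script writes its result to stdout.
    """
    # Use the current interpreter so the call works inside a virtualenv.
    cmd = [sys.executable, script, input_path, *extra_args]
    # check=True raises CalledProcessError if the script exits non-zero.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `run_ocr("image.png", ["--fast"])` corresponds to the fast-mode command above, and `run_ocr("document.pdf")` to the PDF example.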

Supported Formats

Images: PNG, JPG, JPEG, BMP, GIF, WEBP, TIFF

Documents: PDF
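
When processing a folder of mixed files, it can help to filter by extension before calling the script. This is a small sketch based only on the list above: the names `SUPPORTED_EXTENSIONS`, `is_supported`, and `filter_supported` are hypothetical, the match is assumed to be case-insensitive, and variants not listed (such as `.tif`) are deliberately excluded because the source does not mention them.

```python
from pathlib import Path
from typing import Iterable, List

# Extensions taken from the Supported Formats list; nothing else is assumed.
SUPPORTED_EXTENSIONS = {
    ".png", ".jpg", ".jpeg", ".bmp", ".gif", ".webp", ".tiff",  # images
    ".pdf",  # documents
}


def is_supported(path: str) -> bool:
    """Return True if the file extension appears in the supported list."""
    # Assumption: extension matching is case-insensitive.
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def filter_supported(paths: Iterable[str]) -> List[str]:
    """Keep only the paths whose extension is in the supported list."""
    return [p for p in paths if is_supported(p)]
```

This checks the file name only, not the file contents, so a mislabeled file will still be passed through to the script.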
