nutrient-document-processing

Document conversion, extraction, OCR, redaction, signing, and form-filling via the Nutrient DWS API.

  • Converts between 15+ formats, including PDF, DOCX, XLSX, PPTX, HTML, and images (JPG, PNG, TIFF, WebP, SVG, and more)
  • Extracts plain text and tables from documents; OCR supports 100+ languages for scanned PDFs and images
  • Redacts PII using preset patterns (SSN, email, credit card, phone, date, URL, IP, MAC address, ZIP code, VIN) or custom regex
  • Adds watermarks, applies digital CMS signatures, and fills PDF form fields programmatically
  • Available as a REST API or an MCP server for native tool integration

INSTALLATION
npx skills add https://github.com/affaan-m/everything-claude-code --skill nutrient-document-processing
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md


# DOCX to PDF
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.docx=@document.docx" \
  -F 'instructions={"parts":[{"file":"document.docx"}]}' \
  -o output.pdf

# PDF to DOCX
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"docx"}}' \
  -o output.docx

# HTML to PDF
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "index.html=@index.html" \
  -F 'instructions={"parts":[{"html":"index.html"}]}' \
  -o output.pdf

Supported inputs: PDF, DOCX, XLSX, PPTX, DOC, XLS, PPT, PPS, PPSX, ODT, RTF, HTML, JPG, PNG, TIFF, HEIC, GIF, WebP, SVG, TGA, EPS.
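
The file-based conversion calls above differ only in the input file and the `output.type` value, so they can be wrapped in a small shell function. The following is a minimal sketch, not part of the skill itself: `build_instructions` and `convert_document` are hypothetical names, and it assumes that omitting `output` yields a PDF, as the DOCX to PDF example suggests. It adds curl's `--fail` flag so that an HTTP error produces a non-zero exit status instead of an error body saved under the output file name. It does not cover HTML input, which uses the `html` key rather than `file`.

```shell
# Hypothetical helper: print Build API instructions for one input file.
# Usage: build_instructions FILE [OUTPUT_TYPE]
# Assumes FILE contains no double quotes or backslashes (no JSON escaping).
build_instructions() {
  file=$1
  type=${2:-}
  if [ -z "$type" ]; then
    printf '{"parts":[{"file":"%s"}]}\n' "$file"
  else
    printf '{"parts":[{"file":"%s"}],"output":{"type":"%s"}}\n' "$file" "$type"
  fi
}

# Hypothetical wrapper: convert INPUT to OUTPUT, taking the output type
# from OUTPUT's file extension.
# Usage: convert_document INPUT OUTPUT
convert_document() {
  input=$1
  output=$2
  name=$(basename "$input")
  type=${output##*.}
  # Assumption: PDF is the default output, so "output" is omitted for it.
  [ "$type" = "pdf" ] && type=""
  curl --fail -X POST https://api.nutrient.io/build \
    -H "Authorization: Bearer $NUTRIENT_API_KEY" \
    -F "$name=@$input" \
    -F "instructions=$(build_instructions "$name" "$type")" \
    -o "$output"
}
```

With these defined, `convert_document report.docx report.pdf` and `convert_document report.pdf report.docx` would issue the same requests as the first two examples above.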

Extract Text and Data

# Extract plain text
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"text"}}' \
  -o output.txt

# Extract tables as Excel
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"xlsx"}}' \
  -o tables.xlsx

OCR Scanned Documents

# OCR to searchable PDF (supports 100+ languages)
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "scanned.pdf=@scanned.pdf" \
  -F 'instructions={"parts":[{"file":"scanned.pdf"}],"actions":[{"type":"ocr","language":"english"}]}' \
  -o searchable.pdf

Languages: 100+ languages are supported via three-letter codes (e.g., eng, deu, fra, spa, jpn, kor, chi_sim, chi_tra, ara, hin, rus). Most follow ISO 639-2; script variants such as chi_sim and chi_tra extend it. Full language names like english or german also work. See the complete OCR language table for all supported codes.
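
Since the language is the only value that changes between OCR requests, the instructions can be generated from a parameter. This is a minimal sketch under stated assumptions: `ocr_instructions` is a hypothetical helper, the default of `english` simply mirrors the example above, and the character check is only a local guard against malformed JSON, not a list of what the API accepts.

```shell
# Hypothetical helper: print OCR instructions for FILE in LANGUAGE.
# Usage: ocr_instructions FILE [LANGUAGE]
ocr_instructions() {
  file=$1
  lang=${2:-english}
  # Local sanity check only: allow lowercase letters and underscores,
  # which covers codes such as eng or chi_sim and names such as german.
  case $lang in
    *[!a-z_]*)
      echo "unexpected language value: $lang" >&2
      return 1
      ;;
  esac
  printf '{"parts":[{"file":"%s"}],"actions":[{"type":"ocr","language":"%s"}]}\n' "$file" "$lang"
}

# Example use with curl (not run here):
#   curl --fail -X POST https://api.nutrient.io/build \
#     -H "Authorization: Bearer $NUTRIENT_API_KEY" \
#     -F "scanned.pdf=@scanned.pdf" \
#     -F "instructions=$(ocr_instructions scanned.pdf deu)" \
#     -o searchable.pdf
```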

Redact Sensitive Information

# Pattern-based (SSN, email)
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"preset","strategyOptions":{"preset":"social-security-number"}},{"type":"redaction","strategy":"preset","strategyOptions":{"preset":"email-address"}}]}' \
  -o redacted.pdf

# Regex-based
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"regex","strategyOptions":{"regex":"\\b[A-Z]{2}\\d{6}\\b"}}]}' \
  -o redacted.pdf

Presets: social-security-number, email-address, credit-card-number, international-phone-number, north-american-phone-number, date, time, url, ipv4, ipv6, mac-address, us-zip-code, vin.
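
Each preset needs its own action object, so the `actions` array grows quickly when several presets are combined. As a rough sketch, the array can be generated from a list of preset names. `redaction_actions` is a hypothetical helper; it copies the action shape from the pattern-based example above and does not check that the names are valid presets.

```shell
# Hypothetical helper: print a JSON "actions" array containing one
# preset-based redaction action per argument.
# Usage: redaction_actions PRESET...
redaction_actions() {
  actions=""
  for preset in "$@"; do
    action=$(printf '{"type":"redaction","strategy":"preset","strategyOptions":{"preset":"%s"}}' "$preset")
    # Join the action objects with commas, with no leading comma.
    actions="${actions:+$actions,}$action"
  done
  printf '[%s]\n' "$actions"
}

# Example use with curl (not run here):
#   curl --fail -X POST https://api.nutrient.io/build \
#     -H "Authorization: Bearer $NUTRIENT_API_KEY" \
#     -F "document.pdf=@document.pdf" \
#     -F "instructions={\"parts\":[{\"file\":\"document.pdf\"}],\"actions\":$(redaction_actions social-security-number email-address ipv4)}" \
#     -o redacted.pdf
```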

Add Watermarks

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"watermark","text":"CONFIDENTIAL","fontSize":72,"opacity":0.3,"rotation":-45}]}' \
  -o watermarked.pdf

Digital Signatures

# Self-signed CMS signature
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"sign","signatureType":"cms"}]}' \
  -o signed.pdf

Fill PDF Forms

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "form.pdf=@form.pdf" \
  -F 'instructions={"parts":[{"file":"form.pdf"}],"actions":[{"type":"fillForm","formFields":{"name":"Jane Smith","email":"jane@example.com","date":"2026-02-06"}}]}' \
  -o filled.pdf
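
When form values come from variables rather than literals, they need JSON escaping before they are placed in `formFields`. The sketch below shows one way to build that object from `key=value` arguments. `json_escape` and `form_fields` are hypothetical helpers; they escape only backslashes and double quotes, so values containing newlines or other control characters are out of scope.

```shell
# Hypothetical helper: escape backslashes and double quotes for use
# inside a JSON string (control characters are not handled).
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: print a JSON object built from key=value arguments.
# Usage: form_fields KEY=VALUE...
form_fields() {
  fields=""
  for pair in "$@"; do
    key=${pair%%=*}    # text before the first "="
    value=${pair#*=}   # text after the first "="
    fields="${fields:+$fields,}\"$(json_escape "$key")\":\"$(json_escape "$value")\""
  done
  printf '{%s}\n' "$fields"
}

# Example use with curl (not run here):
#   curl --fail -X POST https://api.nutrient.io/build \
#     -H "Authorization: Bearer $NUTRIENT_API_KEY" \
#     -F "form.pdf=@form.pdf" \
#     -F "instructions={\"parts\":[{\"file\":\"form.pdf\"}],\"actions\":[{\"type\":\"fillForm\",\"formFields\":$(form_fields "name=$NAME" "email=$EMAIL")}]}" \
#     -o filled.pdf
```

Here `$NAME` and `$EMAIL` stand in for whatever variables hold the form data in the calling script.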

MCP Server (Alternative)

For native tool integration, use the MCP server instead of curl:

{
  "mcpServers": {
    "nutrient-dws": {
      "command": "npx",
      "args": ["-y", "@nutrient-sdk/dws-mcp-server"],
      "env": {
        "NUTRIENT_DWS_API_KEY": "YOUR_API_KEY",
        "SANDBOX_PATH": "/path/to/working/directory"
      }
    }
  }
}

When to Use

  • Converting documents between formats (PDF, DOCX, XLSX, PPTX, HTML, images)
  • Extracting text, tables, or key-value pairs from PDFs
  • OCR on scanned documents or images
  • Redacting PII before sharing documents
  • Adding watermarks to drafts or confidential documents
  • Digitally signing contracts or agreements
  • Filling PDF forms programmatically
