SKILL.md
$27
I will produce the appropriate lit CLI command or TypeScript script, and once approved, report the results.
Then wait for the user's input.
---
## Step 0 — Install LiteParse (if needed)
If `liteparse` is not yet installed, install it globally:
npm i -g @llamaindex/liteparse
Verify installation:
lit --version
For Office document support (DOCX, PPTX, XLSX), LibreOffice is required:
macOS
brew install --cask libreoffice
Ubuntu/Debian
apt-get install libreoffice
For image parsing, ImageMagick is required:
macOS
brew install imagemagick
Ubuntu/Debian
apt-get install imagemagick
## Step 1 — Produce the CLI Command or Script
### Parse a Single File
Basic text extraction
lit parse document.pdf
JSON output saved to a file
lit parse document.pdf --format json -o output.json
Specific page range
lit parse document.pdf --target-pages "1-5,10,15-20"
Disable OCR (faster, text-only PDFs)
lit parse document.pdf --no-ocr
Use an external HTTP OCR server for higher accuracy
lit parse document.pdf --ocr-server-url http://localhost:8828/ocr
Higher DPI for better quality
lit parse document.pdf --dpi 300
### Batch Parse a Directory
lit batch-parse ./input-directory ./output-directory
Only process PDFs, recursively
lit batch-parse ./input ./output --extension .pdf --recursive
### Generate Page Screenshots
Screenshots are useful for LLM agents that need to see visual layout.
All pages
lit screenshot document.pdf -o ./screenshots
Specific pages
lit screenshot document.pdf --pages "1,3,5" -o ./screenshots
High-DPI PNG
lit screenshot document.pdf --dpi 300 --format png -o ./screenshots
Page range
lit screenshot document.pdf --pages "1-10" -o ./screenshots
## Step 3 — Key Options Reference
### OCR Options
Option
Description
(default)
Tesseract.js — zero setup, built-in
`--ocr-language fra`
Set OCR language (ISO code)
`--ocr-server-url <url>`
Use external HTTP OCR server (EasyOCR, PaddleOCR, custom)
`--no-ocr`
Disable OCR entirely
### Output Options
Option
Description
`--format json`
Structured JSON with bounding boxes
`--format text`
Plain text (default)
`-o <file>`
Save output to file
### Performance / Quality Options
Option
Description
`--dpi <n>`
Rendering DPI (default: 150; use 300 for high quality)
`--max-pages <n>`
Limit pages parsed
`--target-pages <pages>`
Parse specific pages (e.g. `"1-5,10"`)
`--no-precise-bbox`
Disable precise bounding boxes (faster)
`--skip-diagonal-text`
Ignore rotated/diagonal text
`--preserve-small-text`
Keep very small text that would otherwise be dropped
## Step 4 — Using a Config File
For repeated use with consistent options, generate a `liteparse.config.json`:
{
"ocrLanguage": "en",
"ocrEnabled": true,
"maxPages": 1000,
"dpi": 150,
"outputFormat": "json",
"preciseBoundingBox": true,
"skipDiagonalText": false,
"preserveVerySmallText": false
}
For an HTTP OCR server:
{
"ocrServerUrl": "http://localhost:8828/ocr",
"ocrLanguage": "en",
"outputFormat": "json"
}
Use with:
lit parse document.pdf --config liteparse.config.json
## Step 5 — HTTP OCR Server API (Advanced)
If the user wants to plug in a custom OCR backend, the server must implement:
- **Endpoint**: `POST /ocr`
- **Accepts**: `file` (multipart) and `language` (string) parameters
- **Returns**:
{
"results": [
{ "text": "Hello", "bbox": [x1, y1, x2, y2], "confidence": 0.98 }
]
}