SKILL.md
When to Use
- User provides a URL and wants to extract/read its content
- Another skill needs to parse source material from a URL before generation
- User says "parse this URL", "extract content from this link"
- User says "解析链接", "提取内容"
When NOT to Use
- User already has text content and doesn't need URL parsing
- User wants to generate audio/video content (not content extraction)
- User wants to read a local file (use standard file reading tools)
Purpose
Extract and normalize content from URLs across supported platforms. Returns structured data including content body, metadata, and references. Useful as a preprocessing step for content generation skills or standalone content extraction.
Hard Constraints
- No shell scripts. Construct curl commands from the API Reference (Inlined) section below
- See § API Reference (Inlined) below for API key and headers
- See § API Reference (Inlined) below for polling, errors, and interaction patterns
- URL must be a valid HTTP(S) URL
- Always read config following
shared/config-pattern.mdbefore any interaction
- Never save files to
~/Downloads/or.listenhub/— save to the current working directory
Step -1: API Key Check
Follow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.
Step 0: Config Setup
Follow shared/config-pattern.md Step 0 (Zero-Question Boot).
If file doesn't exist — silently create with defaults and proceed:
mkdir -p ".listenhub/content-parser"
echo '{"autoDownload":true}' > ".listenhub/content-parser/config.json"
CONFIG_PATH=".listenhub/content-parser/config.json"
CONFIG=$(cat "$CONFIG_PATH")
Do NOT ask any setup questions. Proceed directly to the Interaction Flow.
If file exists — read config silently and proceed:
CONFIG_PATH=".listenhub/content-parser/config.json"
[ ! -f "$CONFIG_PATH" ] && CONFIG_PATH="$HOME/.listenhub/content-parser/config.json"
CONFIG=$(cat "$CONFIG_PATH")
Setup Flow (user-initiated reconfigure only)
Only run when the user explicitly asks to reconfigure. Display current settings:
当前配置 (content-parser):
自动下载:{是 / 否}
Then ask:
- autoDownload: "自动保存提取的内容到当前目录?"
- "是(推荐)" →
autoDownload: true
- "否" →
autoDownload: false
Save immediately:
NEW_CONFIG=$(echo "$CONFIG" | jq --argjson dl {true/false} '. + {"autoDownload": $dl}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")
Interaction Flow
Step 1: URL Input
Free text input. Ask the user:
What URL would you like to extract content from?
Step 2: Options (optional)
Ask if the user wants to configure extraction options:
Question: "Do you want to configure extraction options?"
Options:
- "No, use defaults" — Extract with default settings
- "Yes, configure options" — Set summarize, maxLength, or Twitter tweet count
If "Yes", ask follow-up questions:
- Summarize: "Generate a summary of the content?" (Yes/No)
- Max Length: "Set maximum content length?" (Free text, e.g., "5000")
- Twitter count (only if URL is Twitter/X profile): "How many tweets to fetch?" (1-100, default 20)
Step 3: Confirm & Extract
Summarize:
Ready to extract content:
URL: {url}
Options: {summarize: true, maxLength: 5000, twitter.count: 50} / default
Proceed?
Wait for explicit confirmation before calling the API.
Workflow
-
Validate URL: Must be HTTP(S). Normalize if needed (see references/supported-platforms.md)
-
Build request body:
{
"source": {
"type": "url",
"uri": "{url}"
},
"options": {
"summarize": true/false,
"maxLength": 5000,
"twitter": {
"count": 50
}
}
}
Omit options if user chose defaults.
-
Submit (foreground): POST /v1/content/extract → extract taskId
-
Tell the user extraction is in progress
-
Poll (background): Run the following exact bash command with run_in_background: true and timeout: 300000. Note: status field is .data.status (not processStatus), interval is 5s, values are processing/completed/failed:
TASK_ID="<id-from-step-3>"
for i in $(seq 1 60); do
RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/content/extract/$TASK_ID" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "X-Source: skills" 2>/dev/null)
STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.status // "processing"')
case "$STATUS" in
completed) echo "$RESULT"; exit 0 ;;
failed) echo "FAILED: $RESULT" >&2; exit 1 ;;
*) sleep 5 ;;
esac
done
echo "TIMEOUT" >&2; exit 2
-
When notified, download and present result:
If autoDownload is true, generate a slug from the extracted title (falling back to domain name if no title). Follow shared/config-pattern.md § Artifact Naming for slug generation and dedup.
- Write
{slug}.mdto the current directory — full extracted content in markdown
- Write
{slug}.jsonto the current directory — full raw API response data
SLUG="{title-slug}" # e.g. "topology-wikipedia"
# Dedup: check if files exist
BASE="$SLUG"; i=2
while [ -e "${SLUG}.md" ] || [ -e "${SLUG}.json" ]; do SLUG="${BASE}-${i}"; i=$((i+1)); done
echo "$CONTENT_MD" > "${SLUG}.md"
echo "$RESULT" > "${SLUG}.json"
Present:
内容提取完成!
来源:{url}
标题:{metadata.title}
长度:~{character count} 字符
消耗积分:{credits}
已保存到当前目录:
{slug}.md
{slug}.json
-
Show a preview of the extracted content (first ~500 chars)
-
Offer to use content in another skill (e.g. /podcast, /tts)
Estimated time: 10-30 seconds depending on content size and platform.
API Reference (Inlined)
Authentication
Environment variable: LISTENHUB_API_KEY (format: lh_sk_...)
Store in ~/.zshrc (macOS) or ~/.bashrc (Linux):
export LISTENHUB_API_KEY="lh_sk_..."
How to obtain: Visit https://listenhub.ai/settings/api-keys (Pro plan required).
Base URL: https://api.marswave.ai/openapi/v1
Required headers (every request):
Authorization: Bearer $LISTENHUB_API_KEY
Content-Type: application/json
X-Source: skills
The X-Source: skills header identifies requests as coming from Claude Code skills (CLI tool).
curl template:
curl -sS -X POST "https://api.marswave.ai/openapi/v1/{endpoint}" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-H "X-Source: skills" \
-d '{ ... }'
For GET requests, omit -d and change -X POST to -X GET.
Security notes:
- Never log or display full API keys in output
- API keys are transmitted via HTTPS only
- Do not pass sensitive or confidential information as content input — it is sent to external APIs for processing
POST /v1/content/extract
Create a content extraction task for a URL. Returns a taskId for polling.
Request body:
Field
Required
Type
Description
source
Yes
object
Source to extract from
source.type
Yes
string
Must be "url"
source.uri
Yes
string
Valid HTTP(S) URL to extract content from
options
No
object
Extraction options
options.summarize
No
boolean
Whether to generate a summary
options.maxLength
No
integer
Maximum content length
options.twitter
No
object
Twitter/X specific options
options.twitter.count
No
integer
Number of tweets to fetch (1-100, default 20)
Response:
{
"code": 0,
"message": "success",
"data": {
"taskId": "69a7dac700cf95938f86d9bb"
}
}
Error codes:
Code
Meaning
29003
Validation error ("source.uri" is required, "source.uri" must be a valid uri)
21007
Invalid API key
GET /v1/content/extract/{taskId}
Get extraction task status and results.
Path params:
Param
Type
Description
taskId
string
24-char hex task ID
Response states:
- processing — Task is still running
- completed — Extraction finished, data available
- failed — Extraction failed, check
failCodeandmessage
Response (processing):
{
"code": 0,
"message": "success",
"data": {
"taskId": "69a7dac700cf95938f86d9bb",
"status": "processing",
"createdAt": "2025-04-09T12:00:00Z",
"data": null,
"credits": 0,
"failCode": null,
"message": null
}
}
Response (completed):
{
"code": 0,
"message": "success",
"data": {
"taskId": "69a7dac700cf95938f86d9bb",
"status": "completed",
"createdAt": "2025-04-09T12:00:00Z",
"data": {
"content": "Extracted text content...",
"metadata": {
"title": "Article Title",
"author": "Author Name",
"publishedAt": "2025-04-01T08:00:00Z"
},
"references": [
"https://example.com/related-article"
]
},
"credits": 5,
"failCode": null,
"message": null
}
}
Response (failed):
{
"code": 0,
"message": "success",
"data": {
"taskId": "69a7dac700cf95938f86d9bb",
"status": "failed",
"createdAt": "2025-04-09T12:00:00Z",
"data": null,
"credits": 0,
"failCode": "EXTRACT_FAILED",
"message": "Unable to extract content from the provided URL"
}
}
Key fields:
Field
Type
Description
status
string
processing, completed, or failed
data.data.content
string
Extracted text content
data.data.metadata
object
Page metadata (title, author, publishedAt)
data.data.references
array
Referenced URLs (array of strings)
credits
integer
Credits consumed
failCode
string
Error code (null on success)
message
string
Error message (null on success)
Error codes:
Code
Meaning
29003
Invalid taskId format
25002
Task not found
Polling Pattern
5-second interval, 60 polls max. Run with run_in_background: true and timeout: 300000.
Two-step pattern:
- Submit (foreground): POST the creation request, extract
taskIdfrom the response.
- Poll (background): Run the polling loop with
run_in_background: true. You will be notified automatically when it completes.
The exact polling bash command is already specified in the Workflow section (Step 5).
Error Handling
HTTP status codes:
Code
Meaning
Action
200
Success
Parse response body
400
Bad request
Check parameters
401
Invalid API key
Re-check LISTENHUB_API_KEY
402
Insufficient credits
Inform user to recharge
403
Forbidden
No permission for this resource
429
Rate limited
Exponential backoff, retry after delay
500/502/503/504
Server error
Retry up to 3 times
Retry strategy:
- 429 rate limit: Wait 15 seconds, then retry (exponential backoff)
- 5xx server errors: Retry up to 3 times with 5-second intervals
- Network errors: Retry up to 3 times
Application error codes:
Code
Meaning
21007
Invalid user API key
25429
Rate limited (application-level)
Example
User: "Parse this article: https://en.wikipedia.org/wiki/Topology"
Agent workflow:
- URL:
https://en.wikipedia.org/wiki/Topology
- Options: defaults (omit options)
- Submit extraction
curl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-H "X-Source: skills" \
-d '{
"source": {
"type": "url",
"uri": "https://en.wikipedia.org/wiki/Topology"
}
}'
- Poll until complete:
curl -sS "https://api.marswave.ai/openapi/v1/content/extract/69a7dac700cf95938f86d9bb" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "X-Source: skills"
- Present extracted content preview and offer next actions.
User: "Extract recent tweets from @elonmusk, get 50 tweets"
Agent workflow:
- URL:
https://x.com/elonmusk
- Options:
{"twitter": {"count": 50}}
- Submit extraction
curl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-H "X-Source: skills" \
-d '{
"source": {
"type": "url",
"uri": "https://x.com/elonmusk"
},
"options": {
"twitter": {
"count": 50
}
}
}'
- Poll until complete, present results.