SKILL.md
1. Current-request override: if the user names a specific backend in the current message, use it.
2. Saved preference: if `EXTEND.md` sets `preferred_image_backend` to a backend available right now, use it.
3. Auto-select (when the preference is `auto`, unset, or the pinned backend isn't available):
   - **Codex (`imagegen`)**: first, inspect your available-skills / tool inventory. If a skill named `imagegen` is listed, you are running inside Codex and MUST use it: invoke via the `Skill` tool with `skill: "imagegen"`, passing the saved prompt file's content (plus output path and aspect ratio per Codex `imagegen`'s own args). Codex `imagegen` is the official raster backend in that runtime and outranks any non-native skill (e.g., `baoyu-imagine`) unless the user has explicitly pinned a different `preferred_image_backend`.
   - Other runtime-native tools: if the runtime exposes a different native image tool (e.g., Hermes `image_generate`), use it the same way.
   - Otherwise, if exactly one non-native backend is installed (e.g., `baoyu-imagine`), use it.
   - Otherwise (multiple non-native backends with no runtime-native tool), ask the user once, batched with any other initial questions.
4. If none are available, tell the user and ask how to proceed.
⛔ Never substitute SVG, HTML, canvas, or other code-based rendering for raster image generation. Codex imagegen's own description says it should be used "when the output should be a bitmap asset rather than repo-native code or vector." If you cannot resolve a raster backend via step 3, fall through to step 4 and ask the user — do not silently emit SVG, write inline <svg> markup, or produce HTML/CSS art as a substitute. This applies even if the article/section seems "diagram-like": the consumer skill calling this rule has already decided that a raster image is what it needs.
⛔ Never repair rendered text by painting over a generated bitmap. Do not use ImageMagick, Pillow, Canvas, SVG, HTML/CSS, OCR scripts, or any other programmatic overlay to cover, rewrite, erase, stroke, or replace title/subtitle text inside an already generated cover image. If text is wrong or unclear, regenerate from a corrected prompt, switch to a lower-text or no-title variant, or ask the user which imperfect candidate to keep.
Setting preferred_image_backend: ask forces the step-3 prompt every run regardless of available backends. Users change the pinned backend via the ## Changing Preferences section below.
Prompt file requirement (hard): write each image's full, final prompt to a standalone file under prompts/ (naming: NN-{type}-[slug].md) BEFORE invoking any backend. The backend receives the prompt file (or its content); the file is the reproducibility record and lets you switch backends without regenerating prompts.
Concrete tool names (imagegen, image_generate, baoyu-imagine) above are examples — substitute the local equivalents under the same rule.
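The selection order above can be sketched in code. This is a minimal illustration, not the skill's actual implementation: `resolve_backend`, the `ASK_USER` and `NONE_AVAILABLE` sentinels, and the assumption that the caller lists runtime-native tools with Codex `imagegen` first are all hypothetical.

```python
from typing import Optional, Sequence

ASK_USER = "ask-user"        # hypothetical sentinel: the user must pick a backend
NONE_AVAILABLE = "none"      # hypothetical sentinel: no backend is installed


def resolve_backend(
    requested: Optional[str],
    preferred: Optional[str],
    native: Sequence[str],
    non_native: Sequence[str],
) -> str:
    """Return a backend name, or a sentinel when the user has to decide."""
    available = list(native) + list(non_native)

    # Step 1: a backend named in the current message wins.
    if requested:
        return requested

    # Step 2: `ask` always prompts; a pinned backend wins if available right now.
    if preferred == "ask":
        return ASK_USER
    if preferred and preferred != "auto" and preferred in available:
        return preferred

    # Step 3: auto-select. Runtime-native tools outrank non-native skills.
    # Assumption: `native` is ordered with Codex `imagegen` first when present.
    if native:
        return native[0]
    if len(non_native) == 1:
        return non_native[0]
    if len(non_native) > 1:
        return ASK_USER

    # Step 4: nothing is available; tell the user and ask how to proceed.
    return NONE_AVAILABLE
```

For example, `resolve_backend(None, "auto", ["imagegen"], ["baoyu-imagine"])` picks the runtime-native tool, while the same call with `preferred="baoyu-imagine"` honors the pin.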
Confirmation Policy
Default behavior: confirm before generation.
- Treat explicit skill invocation, a file path, matched keywords/presets, `EXTEND.md` defaults, and any documented auto-selection as recommendation inputs only. None of them authorizes skipping confirmation.
- Do not start Step 3 or Step 4 until the user confirms the dimensions / aspect / language / backend choices.
- Skip confirmation only when the current request explicitly says to do so, for example: `--quick`, "直接生成" (generate directly), "不用确认" (no need to confirm), "跳过确认" (skip confirmation), "按默认出图" (generate with defaults), or equivalent wording. `quick_mode: true` in `EXTEND.md` counts as a standing explicit opt-out; set it only when you want every run to skip Step 2.
- If confirmation is skipped explicitly, state the assumed dimensions / aspect / language / backend in the next user-facing update before generating.
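The opt-out rule above could be approximated with a simple check. This is a rough sketch under stated assumptions: `confirmation_skipped` and `SKIP_PHRASES` are hypothetical names, and plain substring matching only covers the literal examples listed, not the "equivalent wording" an agent is expected to recognize.

```python
# Literal opt-out phrases taken from the policy above.
SKIP_PHRASES = ("--quick", "直接生成", "不用确认", "跳过确认", "按默认出图")


def confirmation_skipped(request_text: str, quick_mode: bool = False) -> bool:
    """Return True only for an explicit opt-out in the request or in EXTEND.md."""
    # quick_mode: true in EXTEND.md is a standing explicit opt-out.
    if quick_mode:
        return True
    # Otherwise the current request itself has to contain an opt-out phrase.
    return any(phrase in request_text for phrase in SKIP_PHRASES)
```

Note that a file path, a preset, or an `EXTEND.md` default never appears in this check, which mirrors the rule that none of them authorizes skipping confirmation.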
Options
| Option | Description |
|--------|-------------|
| `--type <name>` | hero, conceptual, typography, metaphor, scene, minimal |
| `--palette <name>` | warm, elegant, cool, dark, earth, vivid, pastel, mono, retro, duotone, macaron |
| `--rendering <name>` | flat-vector, hand-drawn, painterly, digital, pixel, chalk, screen-print |
| `--style <name>` | Preset shorthand (see Style Presets) |
| `--text <level>` | none, title-only, title-subtitle, text-rich |
| `--mood <level>` | subtle, balanced, bold |
| `--font <name>` | clean, handwritten, serif, display |
| `--aspect <ratio>` | 16:9 (default), 2.35:1, 4:3, 3:2, 1:1, 3:4 |
| `--lang <code>` | Title language (en, zh, ja, etc.) |
| `--no-title` | Alias for `--text none` |
| `--quick` | Skip confirmation, use auto-selection |
| `--ref <files...>` | Reference images for style/composition guidance |
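If these options ever needed to be validated programmatically, an `argparse` parser is one way to express them. The sketch below is illustrative only: the skill is normally driven by an agent reading the request, and `build_parser` and the `cover-image` program name are hypothetical.

```python
import argparse

TYPES = ["hero", "conceptual", "typography", "metaphor", "scene", "minimal"]
PALETTES = ["warm", "elegant", "cool", "dark", "earth", "vivid",
            "pastel", "mono", "retro", "duotone", "macaron"]
RENDERINGS = ["flat-vector", "hand-drawn", "painterly", "digital",
              "pixel", "chalk", "screen-print"]
TEXT_LEVELS = ["none", "title-only", "title-subtitle", "text-rich"]
MOODS = ["subtle", "balanced", "bold"]
FONTS = ["clean", "handwritten", "serif", "display"]
ASPECTS = ["16:9", "2.35:1", "4:3", "3:2", "1:1", "3:4"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cover-image")  # hypothetical name
    parser.add_argument("--type", choices=TYPES)
    parser.add_argument("--palette", choices=PALETTES)
    parser.add_argument("--rendering", choices=RENDERINGS)
    parser.add_argument("--style")  # preset shorthand, resolved elsewhere
    parser.add_argument("--text", choices=TEXT_LEVELS)
    parser.add_argument("--mood", choices=MOODS)
    parser.add_argument("--font", choices=FONTS)
    # Left as None when absent so the caller can tell "not specified"
    # apart from the documented 16:9 default.
    parser.add_argument("--aspect", choices=ASPECTS)
    parser.add_argument("--lang")
    # --no-title is an alias for --text none.
    parser.add_argument("--no-title", dest="text",
                        action="store_const", const="none")
    parser.add_argument("--quick", action="store_true")
    parser.add_argument("--ref", nargs="*", default=[])
    return parser
```

Parsing `["--type", "hero", "--no-title"]` with this parser sets `text` to `none`, the same as passing `--text none`.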
Six Dimensions
| Dimension | Values | Default |
|-----------|--------|---------|
| Type | hero, conceptual, typography, metaphor, scene, minimal | auto |
| Palette | warm, elegant, cool, dark, earth, vivid, pastel, mono, retro, duotone, macaron | auto |
| Rendering | flat-vector, hand-drawn, painterly, digital, pixel, chalk, screen-print | auto |
| Text | none, title-only, title-subtitle, text-rich | title-only |
| Mood | subtle, balanced, bold | balanced |
| Font | clean, handwritten, serif, display | clean |
Auto-selection rules: references/auto-selection.md
Galleries
Types: hero, conceptual, typography, metaphor, scene, minimal
→ Details: references/types.md
Palettes: warm, elegant, cool, dark, earth, vivid, pastel, mono, retro, duotone, macaron
→ Details: references/palettes/
Renderings: flat-vector, hand-drawn, painterly, digital, pixel, chalk, screen-print
→ Details: references/renderings/
Text Levels: none (pure visual) | title-only (default) | title-subtitle | text-rich (with tags)
→ Details: references/dimensions/text.md
Mood Levels: subtle (low contrast) | balanced (default) | bold (high contrast)
→ Details: references/dimensions/mood.md
Fonts: clean (sans-serif) | handwritten | serif | display (bold decorative)
→ Details: references/dimensions/font.md
File Structure
Output directory per default_output_dir preference:
- `same-dir`: `{article-dir}/`
- `imgs-subdir`: `{article-dir}/imgs/`
- `independent` (default): `cover-image/{topic-slug}/`
<output-dir>/
├── source-{slug}.{ext} # Source files
├── refs/ # Reference images (if provided)
│ ├── ref-01-{slug}.{ext}
│ └── ref-01-{slug}.md # Description file
├── prompts/cover.md # Generation prompt
└── cover.png # Output image
Slug: 2-4 words, kebab-case. Conflict: append -YYYYMMDD-HHMMSS
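The output-directory and slug rules above can be sketched as follows. This is a minimal sketch with hypothetical helper names (`make_slug`, `resolve_output_dir`); it assumes that a "conflict" means the target directory already exists, and the ASCII-only slug pattern is a simplification that would produce an empty slug for non-Latin topics.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional


def make_slug(topic: str, max_words: int = 4) -> str:
    """Kebab-case slug from the first few ASCII words of the topic."""
    words = re.findall(r"[a-z0-9]+", topic.lower())
    return "-".join(words[:max_words])


def resolve_output_dir(
    mode: str,
    article_dir: Path,
    slug: str,
    base: Path = Path("."),
    now: Optional[datetime] = None,
) -> Path:
    """Map a default_output_dir value to a concrete directory."""
    if mode == "same-dir":
        return article_dir
    if mode == "imgs-subdir":
        return article_dir / "imgs"
    if mode != "independent":
        raise ValueError(f"unknown default_output_dir: {mode}")
    target = base / "cover-image" / slug
    # Assumption: an existing directory counts as a conflict.
    if target.exists():
        stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
        target = base / "cover-image" / f"{slug}-{stamp}"
    return target
```

A caller would typically build the slug once with `make_slug(topic)` and reuse it for both the directory name and the `source-{slug}` file.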
Workflow
Progress Checklist
Cover Image Progress:
- [ ] Step 0: Check preferences (EXTEND.md) ⛔ BLOCKING
- [ ] Step 1: Analyze content + save refs + determine output dir
- [ ] Step 2: Confirm options (6 dimensions) ⚠️ unless --quick
- [ ] Step 3: Create prompt
- [ ] Step 4: Generate image
- [ ] Step 5: Completion report
Flow
Input → [Step 0: Preferences] ─┬─ Found → Continue
                               └─ Not found → First-Time Setup ⛔ BLOCKING → Save EXTEND.md → Continue
        ↓
Analyze + Save Refs → [Output Dir] → [Confirm: 6 Dimensions] → Prompt → Generate → Complete
                                                ↓
                                     (skip if --quick or all specified)
Step 0: Load Preferences ⛔ BLOCKING
Check EXTEND.md in priority order — the first one found wins:
| Priority | Path | Scope |
|----------|------|-------|
| 1 | `.baoyu-skills/baoyu-cover-image/EXTEND.md` | Project |
| 2 | `${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-cover-image/EXTEND.md` | XDG |
| 3 | `$HOME/.baoyu-skills/baoyu-cover-image/EXTEND.md` | User home |
| Result | Action |
|--------|--------|
| Found | Load, display summary → Continue |
| Not found | ⛔ Run first-time setup (references/config/first-time-setup.md) → Save → Continue |
CRITICAL: If not found, complete setup BEFORE any other steps or questions.
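The priority lookup above amounts to checking three paths in order and stopping at the first hit. The sketch below illustrates this under stated assumptions: `candidate_paths` and `find_extend_md` are hypothetical names, and the `env` parameter exists only to make the lookup testable.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional

SKILL_DIR = Path("baoyu-skills") / "baoyu-cover-image"


def candidate_paths(project_root: Path,
                    env: Optional[Mapping[str, str]] = None) -> List[Path]:
    """EXTEND.md locations in priority order: project, XDG, user home."""
    env = os.environ if env is None else env
    home = Path(env.get("HOME") or Path.home())
    # Mirrors ${XDG_CONFIG_HOME:-$HOME/.config}: unset or empty falls back.
    xdg = Path(env.get("XDG_CONFIG_HOME") or home / ".config")
    return [
        project_root / ".baoyu-skills" / "baoyu-cover-image" / "EXTEND.md",
        xdg / SKILL_DIR / "EXTEND.md",
        home / ".baoyu-skills" / "baoyu-cover-image" / "EXTEND.md",
    ]


def find_extend_md(project_root: Path,
                   env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the first EXTEND.md that exists, or None to trigger setup."""
    for path in candidate_paths(project_root, env):
        if path.is_file():
            return path
    return None
```

A `None` result corresponds to the "Not found" row: first-time setup runs before any other step or question.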
Step 1: Analyze Content
- Save reference images (if provided) → references/workflow/reference-images.md
- Save source content (if pasted, save to `source.md`)
- Analyze content: topic, tone, keywords, visual metaphors
- Deep analyze references ⚠️: Extract specific, concrete elements (see reference-images.md)
- Detect language: Compare source, user input, EXTEND.md preference
- Determine output directory: Per File Structure rules
⚠️ People in Reference Images:
If reference images contain people who should appear in the cover:
- **Model supports `--ref`** (default): Copy image to `refs/`, pass via `--ref` at generation. No description file needed; the model sees the face directly.
- **Model does NOT support `--ref`** (Jimeng, Seedream 3.0): Create `refs/ref-NN-{slug}.md` with per-character description (hair, glasses, skin tone, clothing). Embed as MUST/REQUIRED instructions in prompt text.
See reference-images.md for full decision table.
Step 2: Confirm Options ⚠️
Hard gate: this step is mandatory per the [Confirmation Policy](#confirmation-policy) — Steps 3–4 cannot start until the user confirms here (or explicitly opts out with --quick / quick_mode: true / equivalent wording in the current request).
**MUST use AskUserQuestion tool** to present options as interactive selection — NOT plain text tables. Present up to 4 questions in a single AskUserQuestion call (Type, Palette, Rendering, Font + Settings). Each question shows the recommended option first with reason, followed by alternatives.
Full confirmation flow and question format: references/workflow/confirm-options.md
| Condition | Skipped | Still Asked |
|-----------|---------|-------------|
| `--quick` or `quick_mode: true` | 6 dimensions | Aspect ratio (unless `--aspect`) |
| All 6 + `--aspect` specified | All | None |
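The skip table above can be read as a small decision function. This is a sketch with a hypothetical name (`questions_to_ask`); the table only covers the quick and fully-specified cases, so asking only for the unspecified dimensions in a partially specified request is an assumption.

```python
from typing import Dict, List

DIMENSIONS = ("type", "palette", "rendering", "text", "mood", "font")


def questions_to_ask(specified: Dict[str, str], quick: bool = False) -> List[str]:
    """List the settings that still need user confirmation."""
    pending: List[str] = []
    if not quick:
        # Assumption: without --quick, only unspecified dimensions are asked.
        pending += [d for d in DIMENSIONS if d not in specified]
    # Aspect ratio is still asked under --quick unless --aspect was given.
    if "aspect" not in specified:
        pending.append("aspect")
    return pending
```

With `--quick` and no `--aspect`, the function returns only `["aspect"]`; with all six dimensions and `--aspect` given, it returns an empty list.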
Step 3: Create Prompt
Save to prompts/cover.md. Template: references/workflow/prompt-template.md
CRITICAL - References in Frontmatter:
- Files saved to `refs/` → Add to frontmatter `references` list
- Style extracted verbally (no file) → Omit `references`, describe in body
- Before writing → Verify: `test -f refs/ref-NN-{slug}.{ext}`
Reference elements in body MUST be detailed, prefixed with "MUST"/"REQUIRED", with integration approach.
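The verify-before-writing rule above could be automated along these lines. This is a sketch only: `existing_refs` and `frontmatter` are hypothetical helpers, and the frontmatter layout shown is an assumption, since the real schema lives in references/workflow/prompt-template.md.

```python
from pathlib import Path
from typing import Iterable, List


def existing_refs(output_dir: Path, ref_names: Iterable[str]) -> List[str]:
    """Keep only reference files that exist under refs/ (like `test -f`)."""
    kept = []
    for name in ref_names:
        if (output_dir / "refs" / name).is_file():
            kept.append(f"refs/{name}")
    return kept


def frontmatter(refs: List[str]) -> str:
    """Build frontmatter; the `references` key is omitted when no file exists."""
    lines = ["---"]
    if refs:
        lines.append("references:")
        lines += [f"  - {ref}" for ref in refs]
    lines.append("---")
    return "\n".join(lines)
```

When a style was only described verbally, `existing_refs` returns an empty list, the `references` key is left out, and the description belongs in the prompt body instead.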
Step 4: Generate Image
- Backup existing `cover.png` if regenerating
- Select backend via the `## Image Generation Tools` rule at the top: use whatever is available; if multiple, ask the user once. Do this once per session before any generation.
- Write the full final prompt to `prompts/01-cover-[slug].md` (hard requirement) BEFORE invoking the backend.
- Process references from prompt frontmatter:
  - `direct` usage → pass via `--ref` (use ref-capable backend)
  - `style`/`palette` → extract traits, append to prompt
- Generate: Call the chosen backend with the prompt file, output path, aspect ratio
- On failure: auto-retry once
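The backup and retry-once behavior in the steps above might look like the sketch below. It makes several assumptions: `backup_existing` and `generate_with_retry` are hypothetical helpers, the backup file name is invented for illustration, and `generate` stands in for whichever backend call was selected.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Callable, Optional


def backup_existing(cover: Path, now: Optional[datetime] = None) -> Optional[Path]:
    """Move an existing cover aside before regenerating; None if absent."""
    if not cover.exists():
        return None
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    # Hypothetical naming scheme; the skill does not specify one here.
    backup = cover.with_name(f"{cover.stem}-backup-{stamp}{cover.suffix}")
    shutil.move(str(cover), str(backup))
    return backup


def generate_with_retry(generate: Callable[[], Path], retries: int = 1) -> Path:
    """Call the backend, retrying once on failure by default."""
    last_error: Optional[Exception] = None
    for _attempt in range(retries + 1):
        try:
            return generate()
        except Exception as error:  # any backend failure triggers the retry
            last_error = error
    raise RuntimeError("image generation failed after retry") from last_error
```

If the second attempt also fails, the error is raised to the caller rather than retried again, which matches the single auto-retry described above.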
Step 5: Completion Report
Cover Generated!
Topic: [topic]
Type: [type] | Palette: [palette] | Rendering: [rendering]
Text: [text] | Mood: [mood] | Font: [font] | Aspect: [ratio]
Title: [title or "visual only"]
Language: [lang] | Watermark: [enabled/disabled]
References: [N images or "extracted style" or "none"]
Location: [directory path]
Files:
✓ source-{slug}.{ext}
✓ prompts/cover.md
✓ cover.png
Image Modification
| Action | Steps |
|--------|-------|
| Regenerate | Backup → Update prompt file FIRST → Regenerate |
| Change dimension | Backup → Confirm new value → Update prompt → Regenerate |
Text correction policy:
- If the title/subtitle is misspelled, garbled, hard to read, or visually weak, do not patch the bitmap with code.
- For text-correction regenerations, write a new prompt file and a new output path so the flawed candidate is preserved for comparison.
- Post-processing is limited to crop, resize, compression, or format conversion that does not alter text or the main composition.
Composition Principles
- Whitespace: 40-60% breathing room
- Visual anchor: Main element centered or offset left
- Characters: Simplified silhouettes; NO realistic humans
- Title: Use exact title from user/source; never invent
Changing Preferences
EXTEND.md lives at the path noted in Step 0. Three ways to change it:
- Edit directly — open EXTEND.md and change fields. Full schema: references/config/preferences-schema.md.
- Reconfigure interactively — delete EXTEND.md (or ask "reconfigure baoyu-cover-image preferences" / "重新配置"). The next run re-triggers first-time setup.
- Common one-line edits:
  - `preferred_image_backend: auto` (default): runtime-native tool wins, falls back to the only installed backend, asks only if multiple non-native backends are present.
  - `preferred_image_backend: codex-imagegen`: pin to Codex's built-in.
  - `preferred_image_backend: baoyu-imagine`: pin to the baoyu-imagine skill.
  - `preferred_image_backend: ask`: confirm backend every run.
  - `watermark.enabled: true`, `preferred_type`, `preferred_palette`, `preferred_rendering`, `default_aspect`, `quick_mode: true`, `language`: shift the auto-selection defaults and confirmation flow.
References
Dimensions: text.md | mood.md | font.md
Palettes: references/palettes/
Renderings: references/renderings/
Types: references/types.md
Auto-Selection: references/auto-selection.md
Style Presets: references/style-presets.md
Compatibility: references/compatibility.md
Visual Elements: references/visual-elements.md
Workflow: confirm-options.md | prompt-template.md | reference-images.md
Config: preferences-schema.md | first-time-setup.md | watermark-guide.md