citation-management

Manage BibTeX citations for LaTeX papers. Harvest missing citations from a draft using Semantic Scholar, validate cite keys against .bib files, deduplicate…

INSTALLATION
npx skills add https://github.com/lingzhi227/agent-research-skills --skill citation-management
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

Citation Management

Manage the full lifecycle of citations in a LaTeX paper.

Input

  • $0 — Action: harvest, validate, add, format
  • $1 — Path to .tex or .bib file

Scripts

Validate citations (check all cite keys resolve)

python ~/.claude/skills/citation-management/scripts/validate_citations.py \
  --tex paper/main.tex --bib paper/references.bib --check-figures --figures-dir paper/figures/

Reports: missing citations, unused bib entries, duplicate keys, duplicate sections, duplicate labels, undefined references, missing figures.
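The core of the cite-key check can be sketched as follows. This is a minimal illustration and not the actual contents of validate_citations.py: the regular expressions and the function names (cited_keys, bib_keys, check_citations) are assumptions, and only missing, unused, and duplicate keys are covered.

```python
import re

# Assumed patterns: \cite, \citep, \citet*, optional [..] arguments, then {key1, key2}.
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Assumed pattern for the start of a .bib entry: @type{key,
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")


def cited_keys(tex: str) -> set[str]:
    """Collect every key used in a \\cite-style command."""
    keys: set[str] = set()
    for group in CITE_RE.findall(tex):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def bib_keys(bib: str) -> list[str]:
    """List entry keys in file order, keeping repeats so duplicates can be found."""
    skip = {"comment", "string", "preamble"}
    return [key for kind, key in ENTRY_RE.findall(bib) if kind.lower() not in skip]


def check_citations(tex: str, bib: str) -> dict[str, list[str]]:
    """Report missing citations, unused bib entries, and duplicate keys."""
    cited = cited_keys(tex)
    defined = bib_keys(bib)
    seen: set[str] = set()
    duplicates: set[str] = set()
    for key in defined:
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return {
        "missing": sorted(cited - seen),
        "unused": sorted(seen - cited),
        "duplicates": sorted(duplicates),
    }
```

A regex pass like this does not handle commented-out LaTeX or keys pulled in through \input files, so a real validator would need more care there.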

Generate BibTeX from paper database

python ~/.claude/skills/deep-research/scripts/bibtex_manager.py \
  --jsonl paper_db.jsonl --output references.bib

Search for a specific paper to add

python ~/.claude/skills/deep-research/scripts/search_semantic_scholar.py \
  --query "attention is all you need" --max-results 5 \
  --api-key "$(grep S2_API_Key /Users/lingzhi/Code/keys.md 2>/dev/null | cut -d: -f2 | tr -d ' ')"
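For orientation, a search like the one above could be expressed as a request to the public Semantic Scholar Graph API paper-search endpoint. This is a sketch under assumptions: how search_semantic_scholar.py builds its request is not shown in this document, and the helper name build_search_request and the chosen field list are hypothetical.

```python
import urllib.parse
import urllib.request
from typing import Optional

# Public Semantic Scholar Graph API paper-search endpoint.
S2_SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_request(query: str, max_results: int = 5,
                         api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a paper-search request."""
    params = urllib.parse.urlencode({
        "query": query,
        "limit": max_results,
        "fields": "title,year,authors,externalIds",  # assumed field selection
    })
    # The key is optional; unauthenticated requests share a stricter rate limit.
    headers = {"x-api-key": api_key} if api_key else {}
    return urllib.request.Request(f"{S2_SEARCH_URL}?{params}", headers=headers)
```

Sending the request with urllib.request.urlopen and decoding the JSON body would yield the candidate papers. The sketch stops before the network call so it can be inspected offline.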

Harvest missing citations automatically

python ~/.claude/skills/citation-management/scripts/harvest_citations.py \
  --tex paper/main.tex --bib paper/references.bib --output candidates.bib --max-rounds 10

Scans .tex for uncited claims, searches Semantic Scholar, outputs candidate BibTeX entries.

Key flags: --dry-run (preview only), --verbose, --api-key

Auto-fix missing citation placeholders

python ~/.claude/skills/citation-management/scripts/validate_citations.py \
  --tex paper/main.tex --bib paper/references.bib --fix

Generates references_fixed.bib with placeholder entries for all missing citation keys.
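As a rough picture of what a placeholder entry might look like, the sketch below appends one stub per missing key. The entry type (@misc), the field names, and the wording are assumptions for illustration; the real output of --fix may differ.

```python
def placeholder_entry(key: str) -> str:
    """Return a stub BibTeX entry that lets the document compile (assumed layout)."""
    return (
        f"@misc{{{key},\n"
        f"  title = {{TODO: resolve citation {key}}},\n"
        f"  note = {{Placeholder entry; replace with real metadata}},\n"
        f"}}\n"
    )


def with_placeholders(bib_text: str, missing_keys: list[str]) -> str:
    """Append one stub per missing key, in sorted order, to the existing .bib text."""
    stubs = "\n".join(placeholder_entry(k) for k in sorted(set(missing_keys)))
    if not stubs:
        return bib_text
    return bib_text.rstrip("\n") + "\n\n" + stubs
```

Placeholders only silence undefined-citation errors. Each stub still has to be replaced with metadata found via the API before submission.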

Action: harvest — Iterative Citation Harvesting

Based on AI-Scientist's 20-round citation harvesting loop. For each round:

  • Read the current .tex draft
  • Identify the most important missing citation
  • Search Semantic Scholar via script
  • Select the most relevant paper from results
  • Extract BibTeX and generate a clean key (lastNameYearWord)
  • Append to .bib (skip if key exists)
  • Insert \cite{key} at the appropriate location
  • Stop when no more gaps or 20 rounds reached
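The control flow of the loop above can be sketched as follows. Only the round cap, the stop condition, and the skip-if-key-exists rule come from this document. The three callables (find_gap, search, cite) are hypothetical stand-ins for the reading, searching, and inserting steps, and taking the first search result is a placeholder for a real relevance judgment.

```python
from typing import Callable, Optional


def harvest(
    tex: str,
    bib: dict[str, str],
    find_gap: Callable[[str], Optional[str]],   # hypothetical: returns a search query or None
    search: Callable[[str], list[dict]],        # hypothetical: returns papers with "key" and "bibtex"
    cite: Callable[[str, str, str], str],       # hypothetical: inserts \cite{key} for the query
    max_rounds: int = 20,
) -> tuple[str, dict[str, str]]:
    """Run up to max_rounds rounds; each round fills at most one citation gap."""
    bib = dict(bib)  # work on a copy so the caller's mapping is untouched
    for _ in range(max_rounds):
        query = find_gap(tex)
        if query is None:
            break  # no more gaps
        results = search(query)
        if not results:
            continue  # nothing found via the API, so add nothing (never fabricate)
        paper = results[0]  # stand-in for "select the most relevant paper"
        if paper["key"] not in bib:
            bib[paper["key"]] = paper["bibtex"]  # append only when the key is new
        tex = cite(tex, query, paper["key"])
    return tex, bib
```

Passing the steps in as callables keeps the loop testable without network access. A round that finds nothing still counts toward the cap, so the loop always terminates.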

Key rules:

  • DO NOT add a citation that already exists
  • Only add citations found via API — never fabricate
  • Cite broadly — not just popular papers
  • Do not copy verbatim from prior literature

Action: validate — Pre-Compilation Check

Run validate_citations.py to catch all issues before compilation. Fix any reported problems.

Action: add — Add Specific Paper

Search Semantic Scholar for the paper, extract BibTeX, clean the key, append to .bib.

BibTeX key format: firstAuthorLastNameYearFirstContentWord (e.g., vaswani2017attention)
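One way to derive such a key is sketched below. The stopword list and the handling of accents and "Last, First" names are assumptions; only the overall pattern (first author's last name, year, first content word of the title) comes from the format above.

```python
import re
import unicodedata

# Assumed list of words that do not count as a "content word".
STOPWORDS = {"a", "an", "the", "on", "of", "in", "for", "to", "and", "with", "towards", "toward"}


def _ascii_lower(text: str) -> str:
    """Strip accents and anything that is not a-z, then lowercase."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii").lower()
    return re.sub(r"[^a-z]", "", ascii_text)


def make_key(first_author: str, year: int, title: str) -> str:
    """Build lastNameYearWord, e.g. vaswani2017attention."""
    if "," in first_author:          # "Vaswani, Ashish"
        last = first_author.split(",")[0]
    else:                            # "Ashish Vaswani"
        parts = first_author.split()
        last = parts[-1] if parts else ""
    words = (_ascii_lower(w) for w in re.findall(r"[^\W\d_]+", title))
    first_word = next((w for w in words if w and w not in STOPWORDS), "")
    return f"{_ascii_lower(last)}{year}{first_word}"


print(make_key("Ashish Vaswani", 2017, "Attention Is All You Need"))  # vaswani2017attention
```

Two papers by the same first author in the same year can still collide on the first content word, so the result should be checked against existing keys before appending.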

Action: format — Standardize .bib

  • Sort entries alphabetically by key
  • Ensure consistent indentation (2 spaces)
  • Remove empty fields
  • Protect proper nouns with {Braces} in titles
  • Ensure required fields per entry type
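The formatting rules above can be sketched on entries that have already been parsed into dictionaries. Parsing the .bib file itself is out of scope here. The entry layout ({"type", "key", "fields"}), the function names, and the caller-supplied proper-noun list are assumptions, and the required-field table covers only two standard BibTeX entry types.

```python
import re

# Required fields for two standard BibTeX entry types (subset, for illustration).
REQUIRED = {
    "article": {"author", "title", "journal", "year"},
    "inproceedings": {"author", "title", "booktitle", "year"},
}


def protect(title: str, proper_nouns: list[str]) -> str:
    """Wrap each listed proper noun in braces unless it is already braced."""
    for noun in proper_nouns:
        pattern = rf"(?<![{{\w]){re.escape(noun)}(?![\w}}])"
        title = re.sub(pattern, lambda m: "{" + m.group(0) + "}", title)
    return title


def missing_fields(entry: dict) -> list[str]:
    """Return required fields that are absent or empty for this entry type."""
    present = {k for k, v in entry["fields"].items() if v and v.strip()}
    return sorted(REQUIRED.get(entry["type"].lower(), set()) - present)


def format_entry(entry: dict, proper_nouns: list[str]) -> str:
    """Render one entry with 2-space indentation, dropping empty fields."""
    fields = {k: v.strip() for k, v in entry["fields"].items() if v and v.strip()}
    if "title" in fields:
        fields["title"] = protect(fields["title"], proper_nouns)
    lines = [f"@{entry['type']}{{{entry['key']},"]
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)


def format_bib(entries: list[dict], proper_nouns: list[str]) -> str:
    """Sort entries alphabetically by key and join them with blank lines."""
    ordered = sorted(entries, key=lambda e: e["key"])
    return "\n\n".join(format_entry(e, proper_nouns) for e in ordered) + "\n"
```

Brace protection here relies on an explicit list of proper nouns, because guessing them from capitalization would also brace ordinary title-case words.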
