paper-audit

Reviewer-style audit and submission gate for Chinese and English academic papers across LaTeX, Typst, and PDF formats. Use whenever the user wants peer-review…

INSTALLATION
npx skills add https://github.com/bahayonghang/academic-writing-skills --skill paper-audit
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

Paper Audit Skill v4.5

paper-audit is deep-review-first. Its core job is to behave like a serious reviewer: find technical, methodological, claim-level, and cross-section issues; keep script-backed findings separate from reviewer judgment; and return a structured issue bundle plus a revision roadmap.

Version 4.5 adds a script-backed PRESUBMISSION layer for final-week mechanical checks (em dashes, AI-tone term frequency, abstract completeness, LaTeX citation/label/equation hygiene, paragraph-shape weak signals, concrete captions). It plugs into existing modes; it is not a separate public mode. See references/PRESUBMISSION_GUIDE.md for mode integration.

Use it for audit and review. Do not use it as the first tool for source editing, sentence rewriting, or build fixing.

What This Skill Produces

  • quick-audit: fast submission-readiness screen with script-backed findings, including PRESUBMISSION
  • deep-review: reviewer-style structured issue bundle with major/moderate/minor findings
  • gate: PASS/FAIL decision calibrated for submission blockers; PRESUBMISSION Major/Minor findings remain advisory
  • re-audit: compare current issue bundle against a previous audit, including mechanical regression findings
  • polish: precheck-only handoff into a polishing workflow

The primary product is no longer just a score. For deep-review, the main outputs are:

  • final_issues.json
  • overall_assessment.txt
  • review_report.md
  • peer_review_report.md
  • revision_roadmap.md

Do Not Use

  • direct source surgery on .tex / .typ
  • compilation debugging as the main task
  • free-form literature survey writing
  • paragraph-level related-work rewriting
  • cosmetic grammar cleanup without an audit goal

Critical Rules

  • Don't rewrite the paper source — paper-audit is a reviewer, not an editor; switch skills explicitly if the user wants prose changes, so review evidence stays separable from edits.
  • Don't fabricate references, baselines, or reviewer evidence — invented citations and made-up reviewer voices undermine every other finding in the bundle.
  • Distinguish [Script] from [LLM] findings — script-backed items have a deterministic anchor the user can rerun, while LLM findings need a quote or section to be falsifiable.
  • Anchor every reviewer finding to a quote, section, or exact textual location — unanchored complaints become impossible to audit on a re-pass.
  • Be conservative with OCR noise, formatting quirks, and copy-editing trivia — flagging cosmetic noise inflates the report and buries the real issues.
  • Read like a careful reader before flagging — understand the author's intended meaning first so the issue captures a real misread, not a strawman.
  • For literature findings, judge whether the gap is evidence-backed and fairly positioned, and don't rewrite the prose inside paper-audit — keep prose rewrites in the format-specific writing skills where they can be reviewed in isolation.
  • For PRESUBMISSION, map CRITICAL / MAJOR / MINOR to Critical / Major / Minor script severities; only Critical or failed checklist items can fail gate — otherwise mechanical findings drown out the substantive ones.

Full mode-integration matrix lives in references/PRESUBMISSION_GUIDE.md.

  • In PDF mode, do not guess source-only hygiene. Report text-proven items and note that LaTeX/Typst source checks were skipped.
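The PDF-mode rule above can be pictured as a small planning step that decides which checks run and which are reported as skipped. This is only a sketch: the check names, the extension set, and the function `plan_checks` are hypothetical and are not taken from the skill's scripts.

```python
from pathlib import Path

# Assumption: illustrative check names, not the skill's real check identifiers.
SOURCE_EXTENSIONS = {".tex", ".typ"}
SOURCE_ONLY_CHECKS = ["citation_hygiene", "label_hygiene", "equation_hygiene"]
TEXT_CHECKS = ["em_dashes", "ai_tone_terms", "abstract_completeness"]


def plan_checks(paper_path: str) -> dict:
    """Split checks into those to run and those to report as skipped."""
    suffix = Path(paper_path).suffix.lower()
    if suffix in SOURCE_EXTENSIONS:
        # Source is available, so both text and source-level checks apply.
        return {"run": TEXT_CHECKS + SOURCE_ONLY_CHECKS, "skipped": []}
    if suffix == ".pdf":
        # No source: run text-proven checks only and list the rest as skipped.
        return {"run": list(TEXT_CHECKS), "skipped": list(SOURCE_ONLY_CHECKS)}
    raise ValueError(f"unsupported paper format: {suffix!r}")
```

The point of returning the skipped list explicitly is that the report can state what was not checked instead of silently omitting it.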

Mode Selection

| Requested intent | Mode |
| --- | --- |
| "check my paper", "quick audit", "submission readiness", "pre-submission review", "投稿前检查" | quick-audit |
| "review my paper", "simulate peer review", "harsh review", "deep review" | deep-review |
| "is this ready to submit", "gate this submission", "blockers only" | gate |
| "did I fix these issues", "re-audit", "compare against old review" | re-audit |
| "polish the writing, but only if safe" | polish |

Legacy aliases still work for one compatibility cycle:

  • self-check -> quick-audit
  • review -> deep-review
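The alias handling above amounts to a lookup followed by a validity check. A minimal sketch follows; the mode names and aliases come from this document, while `normalize_mode` is a hypothetical helper and not part of the skill's scripts.

```python
# The five public modes and the two legacy aliases listed in this document.
CANONICAL_MODES = {"quick-audit", "deep-review", "gate", "re-audit", "polish"}
LEGACY_ALIASES = {"self-check": "quick-audit", "review": "deep-review"}


def normalize_mode(requested: str) -> str:
    """Map a requested mode (possibly a legacy alias) to a canonical mode."""
    mode = requested.strip().lower()
    mode = LEGACY_ALIASES.get(mode, mode)
    if mode not in CANONICAL_MODES:
        raise ValueError(f"unknown mode: {requested!r}")
    return mode
```

For example, `normalize_mode("self-check")` resolves to `quick-audit`, and an unrecognized name raises instead of silently falling back to a default.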

For per-mode workflow steps, input resolution rules, presentation surface rules, and committee focus routing, see references/MODE_GUIDE.md.

Review Standard

Read these references before running reviewer-style work:

  • references/REVIEW_CRITERIA.md
  • references/DEEP_REVIEW_CRITERIA.md
  • references/CHECKLIST.md
  • references/CONSOLIDATION_RULES.md
  • references/ISSUE_SCHEMA.md
  • references/PRE_SUBMISSION_RULES.md
  • references/PRESUBMISSION_GUIDE.md
  • references/MODE_GUIDE.md

The deep-review workflow uses a 16-part issue taxonomy:

  • formula / derivation errors
  • notation inconsistency
  • prose vs formal object mismatch
  • numerical inconsistency
  • missing justification
  • overclaim or claim inaccuracy
  • ambiguity that can mislead a careful reader
  • underspecified methods / missing information
  • internal contradiction
  • self-consistency of standards
  • table structure violations
  • abstract structural incompleteness
  • theory contribution deficiency
  • qualitative methodology opacity
  • pseudo-innovation / straw man
  • paragraph-level argument incoherence

Workflow

Each mode has the same shape: parse $ARGUMENTS, lock the paper path, infer mode/report-style/focus/language if not provided, then run the canonical command. Detailed phase steps are in references/MODE_GUIDE.md.

quick-audit

uv run python -B "$SKILL_DIR/scripts/audit.py" <paper> --mode quick-audit ...

Present Submission Blockers -> Quality Improvements -> checklist; call out PRESUBMISSION mechanical findings with [Script] provenance. Escalate to deep-review when the user wants reviewer-depth critique.

deep-review

Five phases (see references/MODE_GUIDE.md for full detail):

  • Workspace prep:
uv run python -B "$SKILL_DIR/scripts/prepare_review_workspace.py" <paper> --output-dir ./review_results
  • Phase 0 automated audit:
uv run python -B "$SKILL_DIR/scripts/audit.py" <paper> --mode deep-review ...
  • Phase 3A committee: dispatch 5 committee agents (editor, theory, literature, methodology, logic) and write committee/consensus.md.
  • Phase 3B section + cross-cutting lanes: section, claims-vs-evidence, notation, evaluation fairness, self-consistency, prior-art, and pre-submission readiness (full/editor focus only).

  • Consolidation:
uv run python -B "$SKILL_DIR/scripts/consolidate_review_findings.py" <review_dir>

uv run python -B "$SKILL_DIR/scripts/verify_quotes.py" <review_dir> --write-back

uv run python -B "$SKILL_DIR/scripts/render_deep_review_report.py" <review_dir>
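The three consolidation commands above run in a fixed order, and each step depends on the output of the previous one. One way to script that ordering is sketched below. The script paths and the `--write-back` flag are copied from the commands above; the helper names `consolidation_commands` and `run_consolidation` are hypothetical, and stopping at the first failure is a design assumption rather than documented behavior.

```python
import subprocess
from typing import List


def consolidation_commands(skill_dir: str, review_dir: str) -> List[List[str]]:
    """Build the consolidation steps as argv lists, in execution order."""
    base = ["uv", "run", "python", "-B"]
    scripts = f"{skill_dir}/scripts"
    return [
        base + [f"{scripts}/consolidate_review_findings.py", review_dir],
        base + [f"{scripts}/verify_quotes.py", review_dir, "--write-back"],
        base + [f"{scripts}/render_deep_review_report.py", review_dir],
    ]


def run_consolidation(skill_dir: str, review_dir: str) -> None:
    """Run each step in order; check=True aborts on the first failing step."""
    for argv in consolidation_commands(skill_dir, review_dir):
        subprocess.run(argv, check=True)
```

Passing argv lists instead of a shell string avoids quoting problems when the review directory path contains spaces.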

When the user explicitly asks for journal-review prose, set --report-style peer-review so peer_review_report.md becomes the Primary View while review_report.md stays as the richer evidence bundle.

gate

uv run python -B "$SKILL_DIR/scripts/audit.py" <paper> --mode gate ...

Run EIC Screening (Phase 0.5) using agents/editor_in_chief_agent.md first, then report PASS/FAIL. Present results in the order verdict -> EIC -> blockers -> advisory. A desk-reject verdict is a gate blocker. Among PRESUBMISSION findings, only Critical ones block the gate; Major and Minor findings remain advisory.
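The gate rules stated in this document (a desk-reject verdict blocks, Critical script findings block, failed checklist items block, Major/Minor stay advisory) can be combined into one decision function. The sketch below is illustrative only: the input shapes, the `"desk-reject"` verdict string, and the name `gate_decision` are assumptions and do not describe what scripts/audit.py actually does.

```python
def gate_decision(findings, checklist, eic_verdict):
    """Return a PASS/FAIL verdict with blockers separated from advisory items.

    Assumed shapes (hypothetical): findings are dicts with "title" and
    "severity" in {"Critical", "Major", "Minor"}; checklist items are dicts
    with "name" and "passed"; eic_verdict is a plain string.
    """
    blockers = []
    if eic_verdict == "desk-reject":
        blockers.append("EIC desk-reject verdict")
    # Only Critical script findings can fail the gate.
    blockers += [f["title"] for f in findings if f["severity"] == "Critical"]
    # A failed checklist item is also a blocker.
    blockers += [f"checklist: {c['name']}" for c in checklist if not c["passed"]]
    # Major and Minor findings are reported but never change the verdict.
    advisory = [f["title"] for f in findings if f["severity"] in ("Major", "Minor")]
    return {
        "verdict": "FAIL" if blockers else "PASS",
        "blockers": blockers,
        "advisory": advisory,
    }
```

Keeping advisory items in a separate list mirrors the presentation order verdict, EIC, blockers, advisory.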

re-audit

Requires --previous-report PATH.

uv run python -B "$SKILL_DIR/scripts/audit.py" <paper> --mode re-audit --previous-report <path> ...

uv run python -B "$SKILL_DIR/scripts/diff_review_issues.py" <old_final_issues.json> <new_final_issues.json>

Present root-cause-aware status labels: FULLY_ADDRESSED, PARTIALLY_ADDRESSED, NOT_ADDRESSED, NEW.
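To make the four status labels concrete, here is one way a root-cause-aware comparison could work, matching issues across bundles by `root_cause_key`. This is not the logic of scripts/diff_review_issues.py, which is not shown in this document. In particular, treating "same root cause, lower severity" as PARTIALLY_ADDRESSED is an assumption made for illustration, and `label_issues` is a hypothetical name.

```python
# Severity values from the issue schema, ordered from least to most severe.
SEVERITY_RANK = {"minor": 0, "moderate": 1, "major": 2}


def label_issues(old_issues, new_issues):
    """Assign a status label to every root cause seen in either bundle."""
    old_by_key = {i["root_cause_key"]: i for i in old_issues}
    new_by_key = {i["root_cause_key"]: i for i in new_issues}
    labels = {}
    for key, old in old_by_key.items():
        current = new_by_key.get(key)
        if current is None:
            # Root cause no longer reported in the new audit.
            labels[key] = "FULLY_ADDRESSED"
        elif SEVERITY_RANK[current["severity"]] < SEVERITY_RANK[old["severity"]]:
            # Assumption: still present but downgraded counts as partial.
            labels[key] = "PARTIALLY_ADDRESSED"
        else:
            labels[key] = "NOT_ADDRESSED"
    for key in new_by_key:
        if key not in old_by_key:
            labels[key] = "NEW"
    return labels
```

Matching on the root-cause key instead of the issue title means a reworded finding about the same underlying problem is not miscounted as fixed plus new.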

polish

uv run python -B "$SKILL_DIR/scripts/audit.py" <paper> --mode polish ...

If blockers exist, stop and report them. Only proceed into polishing if the precheck is safe.

Output Contract

For deep-review, the final issue schema is:

{
  "title": "short issue title",
  "quote": "exact quote from paper",
  "explanation": "why this matters and what remains problematic",
  "comment_type": "methodology|claim_accuracy|presentation|missing_information",
  "severity": "major|moderate|minor",
  "confidence": "high|medium|low|unverified",
  "source_kind": "script|llm",
  "source_section": "methods",
  "related_sections": ["results", "appendix"],
  "root_cause_key": "shared-normalized-key",
  "review_lane": "claims_vs_evidence",
  "gate_blocker": false,
  "quote_verified": true
}
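A lightweight check of one issue object against the schema above might look like the following sketch. It reads the pipe-separated values as enumerations and assumes every field shown is required; the canonical rules live in references/ISSUE_SCHEMA.md and may differ. `validate_issue` is a hypothetical helper.

```python
# Enumerations read off the pipe-separated values in the schema above.
ENUMS = {
    "comment_type": {"methodology", "claim_accuracy", "presentation", "missing_information"},
    "severity": {"major", "moderate", "minor"},
    "confidence": {"high", "medium", "low", "unverified"},
    "source_kind": {"script", "llm"},
}

# Assumption: every field in the example is required, with these Python types.
FIELD_TYPES = {
    "title": str, "quote": str, "explanation": str, "comment_type": str,
    "severity": str, "confidence": str, "source_kind": str,
    "source_section": str, "related_sections": list, "root_cause_key": str,
    "review_lane": str, "gate_blocker": bool, "quote_verified": bool,
}


def validate_issue(issue: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field, expected in FIELD_TYPES.items():
        if field not in issue:
            errors.append(f"missing field: {field}")
        elif not isinstance(issue[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    for field, allowed in ENUMS.items():
        value = issue.get(field)
        if isinstance(value, str) and value not in allowed:
            errors.append(f"{field}: {value!r} not in {sorted(allowed)}")
    return errors
```

Returning all problems at once, instead of raising on the first, makes it easier to fix a malformed reviewer comment in one pass.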

Always prefer:

  • exact quotes over vague paraphrase
  • evidence-backed findings over style commentary
  • issue bundle + roadmap over raw script dumps
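The preference for exact quotes is what makes the `quote_verified` field checkable. A minimal sketch of such a check is below. It is not the implementation of scripts/verify_quotes.py; collapsing whitespace before matching is an assumption made so that line wraps in the extracted paper text do not cause false negatives, and both function names are hypothetical.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse every run of whitespace (including newlines) to one space."""
    return re.sub(r"\s+", " ", text).strip()


def quote_is_verified(quote: str, paper_text: str) -> bool:
    """True if the quote occurs verbatim in the paper, ignoring line wraps."""
    needle = normalize_whitespace(quote)
    # An empty quote anchors nothing, so it never counts as verified.
    return bool(needle) and needle in normalize_whitespace(paper_text)
```

A paraphrased quote fails this check by design, which is what keeps an LLM finding falsifiable on a re-pass.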

References

| File | Purpose |
| --- | --- |
| references/MODE_GUIDE.md | per-mode workflow detail, phase steps, committee focus routing |
| references/PRESUBMISSION_GUIDE.md | PRESUBMISSION mode-integration behavior matrix |
| references/REVIEW_CRITERIA.md | top-level audit scoring and mapping |
| references/DEEP_REVIEW_CRITERIA.md | deep-review-specific issue taxonomy and leniency rules |
| references/CONSOLIDATION_RULES.md | deduplication and root-cause merge policy |
| references/ISSUE_SCHEMA.md | canonical JSON schema |
| references/REVIEW_LANE_GUIDE.md | section lanes and cross-cutting lanes |
| references/PRE_SUBMISSION_RULES.md | final-week mechanical audit rules and term list |
| references/SUBAGENT_TEMPLATES.md | reviewer task templates |
| references/QUICK_REFERENCE.md | CLI and mode cheat sheet |

Scripts

| Script | Purpose |
| --- | --- |
| scripts/audit.py | Phase 0 audit and mode entrypoint |
| scripts/pre_submission_check.py | deterministic PRESUBMISSION mechanical audit layer |
| scripts/prepare_review_workspace.py | create deep-review workspace |
| scripts/build_claim_map.py | extract headline claims and closure targets |
| scripts/consolidate_review_findings.py | deduplicate comment JSONs |
| scripts/verify_quotes.py | verify exact quote presence |
| scripts/render_deep_review_report.py | render final Markdown report |
| scripts/diff_review_issues.py | compare old vs new issue bundles |

Reviewer Lanes

Committee agents (deep-review default):

  • committee_editor_agent.md
  • committee_theory_agent.md
  • committee_literature_agent.md
  • committee_methodology_agent.md
  • committee_logic_agent.md

Default deep-review lanes live in agents/:

  • section_reviewer_agent.md
  • claims_evidence_reviewer_agent.md
  • notation_consistency_reviewer_agent.md
  • evaluation_fairness_reviewer_agent.md
  • self_consistency_reviewer_agent.md
  • prior_art_reviewer_agent.md
  • synthesis_agent.md
  • editor_in_chief_agent.md — EIC desk-reject screener (used in gate mode)

Specialized deep-review agents (read their files for activation criteria):

  • critical_reviewer_agent.md — devil's advocate with C3-C5 checks
  • domain_reviewer_agent.md — domain expertise with A1-A7 assessments
  • methodology_reviewer_agent.md — methodology rigor with B3-B10 checks
  • literature_reviewer_agent.md — evidence-based literature verification
    (optional, --literature-search)

Examples

  • "Review this manuscript like a serious conference reviewer and tell me the biggest validity risks."
  • "Run a quick audit on paper.tex and tell me what blocks submission."
  • "Gate this IEEE submission and separate blockers from recommendations."
  • "Re-audit this revision against my previous report."
  • "Audit only the literature positioning and tell me whether the claimed gap is real or fabricated by selective citation."
