analyse-problem

Comprehensive A3 one-page problem analysis with root cause and action plan

INSTALLATION
npx skills add https://github.com/neolabhq/context-engineering-kit --skill analyse-problem
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

A3 Problem Analysis

Apply A3 problem-solving format for comprehensive, single-page problem documentation and resolution planning.

Description

Structured one-page analysis format covering: Background, Current Condition, Goal, Root Cause Analysis, Countermeasures, Implementation Plan, and Follow-up. Named after A3 paper size; emphasizes concise, complete documentation.

Usage

/analyse-problem [problem_description]

Variables

  • PROBLEM: Issue to analyze (default: prompt for input)
  • OUTPUT_FORMAT: markdown or text (default: markdown)

Steps

  • Background: Why this problem matters (context, business impact)
  • Current Condition: What's happening now (data, metrics, examples)
  • Goal/Target: What success looks like (specific, measurable)
  • Root Cause Analysis: Why problem exists (use 5 Whys or Fishbone)
  • Countermeasures: Proposed solutions addressing root causes
  • Implementation Plan: Who, what, when, how
  • Follow-up: How to verify success and prevent recurrence

A3 Template

═══════════════════════════════════════════════════════════════

                    A3 PROBLEM ANALYSIS

═══════════════════════════════════════════════════════════════

TITLE: [Concise problem statement]

OWNER: [Person responsible]

DATE: [YYYY-MM-DD]

┌─────────────────────────────────────────────────────────────┐

│ 1. BACKGROUND (Why this matters)                            │

├─────────────────────────────────────────────────────────────┤

│ [Context, impact, urgency, who's affected]                  │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 2. CURRENT CONDITION (What's happening)                     │

├─────────────────────────────────────────────────────────────┤

│ [Facts, data, metrics, examples - no opinions]              │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 3. GOAL/TARGET (What success looks like)                    │

├─────────────────────────────────────────────────────────────┤

│ [Specific, measurable, time-bound targets]                  │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 4. ROOT CAUSE ANALYSIS (Why problem exists)                 │

├─────────────────────────────────────────────────────────────┤

│ [5 Whys, Fishbone, data analysis]                           │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 5. COUNTERMEASURES (Solutions addressing root causes)       │

├─────────────────────────────────────────────────────────────┤

│ [Specific actions, not vague intentions]                    │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 6. IMPLEMENTATION PLAN (Who, What, When)                    │

├─────────────────────────────────────────────────────────────┤

│ [Timeline, responsibilities, dependencies, milestones]      │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 7. FOLLOW-UP (Verification & Prevention)                    │

├─────────────────────────────────────────────────────────────┤

│ [Success metrics, monitoring plan, review dates]            │

└─────────────────────────────────────────────────────────────┘

═══════════════════════════════════════════════════════════════

Examples

Example 1: Database Connection Pool Exhaustion

═══════════════════════════════════════════════════════════════

                    A3 PROBLEM ANALYSIS

═══════════════════════════════════════════════════════════════

TITLE: API Downtime Due to Connection Pool Exhaustion

OWNER: Backend Team Lead

DATE: 2024-11-14

┌─────────────────────────────────────────────────────────────┐

│ 1. BACKGROUND                                                │

├─────────────────────────────────────────────────────────────┤

│ • API goes down 2-3x per week during peak hours             │

│ • Affects 10,000+ users, average 15min downtime             │

│ • Revenue impact: ~$5K per incident                         │

│ • Customer satisfaction score dropped from 4.5 to 3.8       │

│ • Started 3 weeks ago after traffic increased 40%           │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 2. CURRENT CONDITION                                         │

├─────────────────────────────────────────────────────────────┤

│ Observations:                                                │

│ • Connection pool size: 10 (unchanged since launch)         │

│ • Peak concurrent users: 500 (was 300 three weeks ago)      │

│ • Average request time: 200ms (was 150ms)                   │

│ • Connections leaked: ~2 per hour (never released)          │

│ • Error: "Connection pool exhausted" in logs                │

│                                                              │

│ Pattern:                                                     │

│ • Occurs at 2pm-4pm daily (peak traffic)                    │

│ • Gradual degradation over 30 minutes                       │

│ • Recovery requires app restart                             │

│ • Long-running queries block pool (some 30+ seconds)        │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 3. GOAL/TARGET                                               │

├─────────────────────────────────────────────────────────────┤

│ • Zero downtime due to connection exhaustion                │

│ • Support 1000 concurrent users (2x current peak)           │

│ • All connections released within 5 seconds                 │

│ • Achieve within 1 week                                     │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 4. ROOT CAUSE ANALYSIS                                       │

├─────────────────────────────────────────────────────────────┤

│ 5 Whys:                                                      │

│ Problem: Connection pool exhausted                          │

│ Why 1: All 10 connections in use, none available            │

│ Why 2: Connections not released after requests              │

│ Why 3: Error handling doesn't close connections             │

│ Why 4: Try-catch blocks missing .finally()                  │

│ Why 5: No code review checklist for resource cleanup        │

│                                                              │

│ Contributing factors:                                        │

│ • Pool size too small for current load                      │

│ • No connection timeout configured (hangs forever)          │

│ • Slow queries hold connections longer                      │

│ • No monitoring/alerting on pool metrics                    │

│                                                              │

│ ROOT CAUSE: Systematic issue with resource cleanup +        │

│             insufficient pool sizing                         │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 5. COUNTERMEASURES                                           │

├─────────────────────────────────────────────────────────────┤

│ Immediate (This Week):                                       │

│ 1. Audit all DB code, add .finally() for connection release │

│ 2. Increase pool size: 10 → 30                              │

│ 3. Add connection timeout: 10 seconds                       │

│ 4. Add pool monitoring & alerts (>80% used)                 │

│                                                              │

│ Short-term (2 Weeks):                                        │

│ 5. Optimize slow queries (add indexes)                      │

│ 6. Implement connection pooling best practices doc          │

│ 7. Add automated test for connection leaks                  │

│                                                              │

│ Long-term (1 Month):                                         │

│ 8. Migrate to connection pool library with auto-release     │

│ 9. Add linter rule detecting missing .finally()             │

│ 10. Create PR checklist for resource management             │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 6. IMPLEMENTATION PLAN                                       │

├─────────────────────────────────────────────────────────────┤

│ Week 1 (Nov 14-18):                                          │

│ • Day 1-2: Audit & fix connection leaks [Dev Team]          │

│ • Day 2: Increase pool size, add timeout [DevOps]           │

│ • Day 3: Set up monitoring [SRE]                            │

│ • Day 4: Test under load [QA]                               │

│ • Day 5: Deploy to production [DevOps]                      │

│                                                              │

│ Week 2 (Nov 21-25):                                          │

│ • Optimize identified slow queries [DB Team]                │

│ • Write best practices doc [Tech Writer + Dev Lead]         │

│ • Create connection leak test [QA Team]                     │

│                                                              │

│ Week 3-4 (Nov 28 - Dec 9):                                   │

│ • Evaluate connection pool libraries [Dev Team]             │

│ • Add linter rules [Dev Lead]                               │

│ • Update PR template [Dev Lead]                             │

│                                                              │

│ Dependencies: None blocking Week 1 fixes                     │

│ Resources: 2 developers, 1 DevOps, 1 SRE                    │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 7. FOLLOW-UP                                                 │

├─────────────────────────────────────────────────────────────┤

│ Success Metrics:                                             │

│ • Zero downtime incidents (monitor 4 weeks)                 │

│ • Pool usage stays <80% during peak                         │

│ • No connection leaks detected                              │

│ • Response time <200ms p95                                  │

│                                                              │

│ Monitoring:                                                  │

│ • Daily: Check pool usage dashboard                         │

│ • Weekly: Review connection leak alerts                     │

│ • Bi-weekly: Team retrospective on progress                 │

│                                                              │

│ Review Dates:                                                │

│ • Week 1 (Nov 18): Verify immediate fixes effective         │

│ • Week 2 (Nov 25): Assess optimization impact               │

│ • Week 4 (Dec 9): Final review, close A3                    │

│                                                              │

│ Prevention:                                                  │

│ • Add connection handling to onboarding                     │

│ • Monthly audit of resource management code                 │

│ • Include pool metrics in SRE runbook                       │

└─────────────────────────────────────────────────────────────┘

═══════════════════════════════════════════════════════════════

Example 2: Security Vulnerability in Production

═══════════════════════════════════════════════════════════════

                    A3 PROBLEM ANALYSIS

═══════════════════════════════════════════════════════════════

TITLE: Critical SQL Injection Vulnerability

OWNER: Security Team Lead

DATE: 2024-11-14

┌─────────────────────────────────────────────────────────────┐

│ 1. BACKGROUND                                                │

├─────────────────────────────────────────────────────────────┤

│ • Critical security vulnerability reported by researcher    │

│ • SQL injection in user search endpoint                     │

│ • Potential data breach affecting 100K+ user records        │

│ • CVSS score: 9.8 (Critical)                                │

│ • Vulnerability exists in production for 6 months           │

│ • Similar issue found in 2 other endpoints (scanning)       │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 2. CURRENT CONDITION                                         │

├─────────────────────────────────────────────────────────────┤

│ Vulnerable Code:                                             │

│ • /api/users/search endpoint uses string concatenation      │

│ • Input: search query (user-provided, not sanitized)        │

│ • Pattern: `SELECT * FROM users WHERE name = '${input}'`    │

│                                                              │

│ Scope:                                                       │

│ • 3 endpoints vulnerable (search, filter, export)           │

│ • All use same unsafe pattern                               │

│ • No parameterized queries                                  │

│ • No input validation layer                                 │

│                                                              │

│ Risk Assessment:                                             │

│ • Exploitable from public internet                          │

│ • No evidence of exploitation (logs checked)                │

│ • Similar code in admin panel (higher privilege)            │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 3. GOAL/TARGET                                               │

├─────────────────────────────────────────────────────────────┤

│ • Patch all SQL injection vulnerabilities within 24 hours   │

│ • Zero SQL injection vulnerabilities in codebase            │

│ • Prevent similar issues in future code                     │

│ • Verify no unauthorized access occurred                    │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 4. ROOT CAUSE ANALYSIS                                       │

├─────────────────────────────────────────────────────────────┤

│ 5 Whys:                                                      │

│ Problem: SQL injection vulnerability in production          │

│ Why 1: User input concatenated directly into SQL            │

│ Why 2: Developer wasn't aware of SQL injection risks        │

│ Why 3: No security training for new developers              │

│ Why 4: Security not part of onboarding checklist            │

│ Why 5: Security team not involved in development process    │

│                                                              │

│ Contributing Factors (Fishbone):                             │

│ • Process: No security code review                          │

│ • Technology: ORM not used consistently                     │

│ • People: Knowledge gap in secure coding                    │

│ • Methods: No SAST tools in CI/CD                           │

│                                                              │

│ ROOT CAUSE: Security not integrated into development        │

│             process, training gap                            │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 5. COUNTERMEASURES                                           │

├─────────────────────────────────────────────────────────────┤

│ Immediate (24 Hours):                                        │

│ 1. Patch all 3 vulnerable endpoints                         │

│ 2. Deploy hotfix to production                              │

│ 3. Scan codebase for similar patterns                       │

│ 4. Review access logs for exploitation attempts             │

│                                                              │

│ Short-term (1 Week):                                         │

│ 5. Replace all raw SQL with parameterized queries           │

│ 6. Add input validation middleware                          │

│ 7. Set up SAST tool in CI (Snyk/SonarQube)                  │

│ 8. Security team review of all data access code             │

│                                                              │

│ Long-term (1 Month):                                         │

│ 9. Mandatory security training for all developers           │

│ 10. Add security review to PR process                       │

│ 11. Migrate to ORM for all database access                  │

│ 12. Implement security champion program                     │

│ 13. Quarterly security audits                               │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 6. IMPLEMENTATION PLAN                                       │

├─────────────────────────────────────────────────────────────┤

│ Hour 0-4 (Emergency Response):                               │

│ • Write &#x26; test patches [Security + Senior Dev]              │

│ • Emergency PR review [CTO + Tech Lead]                     │

│ • Deploy to staging [DevOps]                                │

│                                                              │

│ Hour 4-24 (Production Deploy):                               │

│ • Deploy hotfix [DevOps + On-call]                          │

│ • Monitor for issues [SRE Team]                             │

│ • Scan logs for exploitation [Security Team]                │

│ • Notify stakeholders [Security Lead + CEO]                 │

│                                                              │

│ Day 2-7:                                                     │

│ • Full codebase remediation [Dev Team]                      │

│ • SAST tool setup [DevOps + Security]                       │

│ • Security review [External Auditor]                        │

│                                                              │

│ Week 2-4:                                                    │

│ • Security training program [Security + HR]                 │

│ • Process improvements [Engineering Leadership]             │

│                                                              │

│ Dependencies: External auditor availability (Week 2)         │

└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐

│ 7. FOLLOW-UP                                                 │

├─────────────────────────────────────────────────────────────┤

│ Success Metrics:                                             │

│ • Zero SQL injection vulnerabilities (verified by scan)     │

│ • 100% of PRs pass SAST checks                              │

│ • 100% developer security training completion               │

│ • No unauthorized access detected in log analysis           │

│                                                              │

│ Verification:                                                │

│ • Day 1: Verify patch deployed, vulnerability closed        │

│ • Week 1: External security audit confirms fixes            │

│ • Week 2: SAST tool catching similar issues                 │

│ • Month 1: Training completion, process adoption            │

│                                                              │

│ Prevention:                                                  │

│ • SAST tools block vulnerable code in CI                    │

│ • Security review required for data access code             │

│ • Quarterly penetration testing                             │

│ • Annual security training refresh                          │

│                                                              │

│ Incident Report:                                             │

│ • Post-mortem meeting: Nov 16                               │

│ • Document lessons learned                                  │

│ • Share with engineering org                                │

└─────────────────────────────────────────────────────────────┘

═══════════════════════════════════════════════════════════════

Notes

  • A3 forces concise, complete thinking (fits on one page)
  • Use data and facts, not opinions or blame
  • Root cause analysis is critical—use /why or /cause-and-effect
  • Countermeasures must address root causes, not symptoms
  • Implementation plan needs clear ownership and timelines
  • Follow-up ensures sustainable improvement
  • A3 becomes historical record for organizational learning
  • Update A3 as situation evolves (living document until closed)
  • Consider A3 for: incidents, recurring issues, major improvements
  • Overkill for: small bugs, one-line fixes, trivial issues
BrowserAct

Let your agent run on any real-world website

Bypass CAPTCHA & anti-bot for free. Start local, scale to cloud.

Explore BrowserAct Skills →

Stop writing automation&scrapers

Install the CLI. Run your first Skill in 30 seconds. Scale when you're ready.

Start free
free · no credit card