golang-performance

Golang performance optimization patterns and methodology - if X bottleneck, then apply Y. Covers allocation reduction, CPU efficiency, memory layout, GC…

INSTALLATION
npx skills add https://github.com/samber/cc-skills-golang --skill golang-performance
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md


Before optimizing Go code, verify the bottleneck is in your process — if 90% of latency is a slow DB query or API call, reducing allocations won't help.

Diagnose:

  1. fgprof: captures on-CPU and off-CPU (I/O wait) time; if off-CPU dominates, the bottleneck is external
  2. go tool pprof (goroutine profile): many goroutines blocked in net.(*conn).Read or database/sql = external wait
  3. Distributed tracing (OpenTelemetry): span breakdown shows which upstream is slow
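As a rough illustration of the goroutine-profile check, the sketch below dumps the profile with the standard library's runtime/pprof and looks for a frame name in it. The helpers goroutineDump and waitForever are hypothetical names for this example; waitForever stands in for a goroutine parked on a slow upstream, where a real service would show frames such as net.(*conn).Read.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
	"time"
)

// goroutineDump returns the text form of the goroutine profile.
// debug=1 groups goroutines that share an identical stack.
func goroutineDump() (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 1); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// waitForever is a hypothetical stand-in for a goroutine stuck on external I/O.
func waitForever(block <-chan struct{}) { <-block }

func main() {
	block := make(chan struct{})
	defer close(block)
	for i := 0; i < 5; i++ {
		go waitForever(block)
	}
	time.Sleep(50 * time.Millisecond) // give the goroutines time to park

	dump, err := goroutineDump()
	if err != nil {
		panic(err)
	}
	// The first line summarizes the total number of goroutines.
	fmt.Println(strings.SplitN(dump, "\n", 2)[0])
	fmt.Println("blocked in waitForever:", strings.Contains(dump, "waitForever"))
}
```

In a long-running service the same profile is usually fetched over HTTP via net/http/pprof and inspected with go tool pprof rather than printed.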

When external: optimize that component instead — query tuning, caching, connection pools, circuit breakers (→ See samber/cc-skills-golang@golang-database skill, Caching Patterns).

Iterative Optimization Methodology

The cycle: Define Goals → Benchmark → Diagnose → Improve → Benchmark

  • Define your metric — latency, throughput, memory, or CPU? Without a target, optimizations are random
  • Write an atomic benchmark — isolate one function per benchmark to avoid result contamination (→ See samber/cc-skills-golang@golang-benchmark skill)
  • Measure baseline: go test -bench=BenchmarkMyFunc -benchmem -count=6 ./pkg/... | tee /tmp/report-1.txt
  • Diagnose — use the Diagnose lines in each deep-dive section to pick the right tool
  • Improve — apply ONE optimization at a time with an explanatory comment
  • Compare: rerun the benchmark into /tmp/report-2.txt, then run benchstat /tmp/report-1.txt /tmp/report-2.txt to confirm the change is statistically significant
  • Commit — paste the benchstat output in the commit body so reviewers and future readers see the exact improvement; follow the perf(scope): summary commit type
  • Repeat — increment report number, tackle next bottleneck

Refer to library documentation for known patterns before inventing custom solutions. Keep all /tmp/report-*.txt files as an audit trail.
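A minimal sketch of an atomic benchmark, assuming a hypothetical string-joining hot path (joinConcat before the change, joinBuilder after it). In a real project each variant would live in a _test.go file as a BenchmarkXxx function and run through go test -bench; testing.Benchmark is used here only so the example runs as a standalone program.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// joinConcat builds the result with +=, copying the whole string each iteration.
func joinConcat(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// joinBuilder pre-sizes a strings.Builder so the buffer is allocated once.
func joinBuilder(parts []string) string {
	n := 0
	for _, p := range parts {
		n += len(p)
	}
	var sb strings.Builder
	sb.Grow(n)
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

var sink string // keeps the compiler from discarding the result

// benchJoin isolates a single function per benchmark.
func benchJoin(join func([]string) string) func(*testing.B) {
	parts := make([]string, 256)
	for i := range parts {
		parts[i] = "x"
	}
	return func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = join(parts)
		}
	}
}

// measure runs one benchmark and returns its result.
func measure(join func([]string) string) testing.BenchmarkResult {
	return testing.Benchmark(benchJoin(join))
}

func main() {
	before, after := measure(joinConcat), measure(joinBuilder)
	fmt.Println("concat: ", before.String(), before.MemString())
	fmt.Println("builder:", after.String(), after.MemString())
}
```

A single run like this is only a smoke test; the -count=6 runs compared with benchstat are what establish significance.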

Decision Tree: Where Is Time Spent?

| Bottleneck | Signal (from pprof) | Action |
|---|---|---|
| Too many allocations | alloc_objects high in heap profile | Memory optimization |
| CPU-bound hot loop | function dominates CPU profile | CPU optimization |
| GC pauses / OOM | high GC%, container limits | Runtime tuning |
| Network / I/O latency | goroutines blocked on I/O | I/O & networking |
| Repeated expensive work | same computation/fetch multiple times | Caching patterns |
| Wrong algorithm | O(n²) where O(n) exists | Algorithmic complexity |
| Lock contention | mutex/block profile hot | → See samber/cc-skills-golang@golang-concurrency skill |
| Slow queries | DB time dominates traces | → See samber/cc-skills-golang@golang-database skill |

Common Mistakes

| Mistake | Fix |
|---|---|
| Optimizing without profiling | Profile with pprof first; intuition is wrong ~80% of the time |
| Default http.Client without Transport | MaxIdleConnsPerHost defaults to 2; set it to match your concurrency level |
| Logging in hot loops | Log calls prevent inlining and allocate even when the level is disabled. Use slog.LogAttrs |
| panic/recover as control flow | panic allocates a stack trace and unwinds the stack; use error returns |
| unsafe without benchmark proof | Only justified when profiling shows >10% improvement in a verified hot path |
| No GC tuning in containers | Set GOMEMLIMIT to 80-90% of container memory to reduce the risk of OOM kills |
| reflect.DeepEqual in production | 50-200x slower than typed comparison; use slices.Equal, maps.Equal, bytes.Equal |
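Three of the fixes above can be sketched with the standard library alone. The helper names (newTransport, newHTTPClient, logItem, sameIDs) and the numeric settings are assumptions for illustration, not recommended values; slog and slices require Go 1.21 or later.

```go
package main

import (
	"context"
	"fmt"
	"log/slog"
	"net/http"
	"os"
	"reflect"
	"slices"
	"time"
)

// newTransport clones the default transport and sizes the per-host idle pool
// to the expected concurrency (the default is http.DefaultMaxIdleConnsPerHost, 2).
func newTransport(concurrency int) *http.Transport {
	t := http.DefaultTransport.(*http.Transport).Clone()
	t.MaxIdleConns = concurrency * 2 // illustrative upper bound across all hosts
	t.MaxIdleConnsPerHost = concurrency
	t.IdleConnTimeout = 90 * time.Second
	return t
}

// newHTTPClient wraps the tuned transport; the timeout is an arbitrary example.
func newHTTPClient(concurrency int) *http.Client {
	return &http.Client{Transport: newTransport(concurrency), Timeout: 10 * time.Second}
}

// logItem passes typed attributes through LogAttrs instead of boxing
// key/value pairs into ...any.
func logItem(ctx context.Context, l *slog.Logger, id int) {
	l.LogAttrs(ctx, slog.LevelDebug, "processed item", slog.Int("id", id))
}

// sameIDs uses a typed comparison in place of reflect.DeepEqual.
func sameIDs(a, b []int) bool { return slices.Equal(a, b) }

func main() {
	client := newHTTPClient(64)
	fmt.Println(client.Transport.(*http.Transport).MaxIdleConnsPerHost) // 64

	// Debug is below the handler's Info level, so this call emits nothing.
	logger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelInfo}))
	logItem(context.Background(), logger, 1)

	a, b := []int{1, 2, 3}, []int{1, 2, 3}
	fmt.Println(sameIDs(a, b), reflect.DeepEqual(a, b)) // true true
}
```

Whether each change pays off still depends on the profile: the transport setting matters only when many requests target the same host concurrently.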

Deep Dives

  • Memory Optimization — allocation patterns, backing array leaks, sync.Pool, struct alignment
  • CPU Optimization — inlining, cache locality, false sharing, ILP, reflection avoidance
  • Runtime Tuning — GOGC, GOMEMLIMIT, GC diagnostics, GOMAXPROCS, PGO
  • Caching Patterns — algorithmic complexity, compiled patterns, singleflight, work avoidance
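To make two of the memory topics concrete, here is a small sketch of struct field ordering and sync.Pool reuse. The types padded and packed and the helpers sizes and render are hypothetical examples, not taken from the deep-dive pages; exact sizes depend on the target architecture.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
	"unsafe"
)

// padded interleaves small and large fields, so the compiler inserts padding
// before count and after dirty.
type padded struct {
	active bool
	count  int64
	dirty  bool
}

// packed places the largest field first so the two bools share one word.
type packed struct {
	count  int64
	active bool
	dirty  bool
}

// sizes reports the in-memory size of both layouts.
func sizes() (uintptr, uintptr) {
	return unsafe.Sizeof(padded{}), unsafe.Sizeof(packed{})
}

// bufPool reuses buffers across calls instead of allocating one per call.
var bufPool = sync.Pool{New: func() any { return new(bytes.Buffer) }}

// render borrows a buffer, resets it, and returns it to the pool when done.
// buf.String() copies the bytes, so the result stays valid after Put.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)
	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String()
}

func main() {
	p, q := sizes()
	fmt.Println(p, q) // typically 24 16 on 64-bit platforms
	fmt.Println(render("gopher"))
}
```

Both techniques are worth applying only after a heap profile points at the type or call site in question.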

CI Regression Detection

Automate benchmark comparison in CI to catch regressions before they reach production. → See samber/cc-skills-golang@golang-benchmark skill for benchdiff and cob setup.

Cross-References

  • → See samber/cc-skills-golang@golang-benchmark skill for benchmarking methodology, benchstat, and b.Loop() (Go 1.24+)
  • → See samber/cc-skills-golang@golang-troubleshooting skill for pprof workflow, escape analysis diagnostics, and performance debugging
  • → See samber/cc-skills-golang@golang-data-structures skill for slice/map preallocation and strings.Builder
  • → See samber/cc-skills-golang@golang-concurrency skill for worker pools, sync.Pool API, goroutine lifecycle, and lock contention
  • → See samber/cc-skills-golang@golang-safety skill for defer in loops, slice backing array aliasing
  • → See samber/cc-skills-golang@golang-database skill for connection pool tuning and batch processing
  • → See samber/cc-skills-golang@golang-observability skill for continuous profiling in production