symbolic-equation

Discover scientific equations from data using LLM-guided evolutionary search (LLM-SR). Multi-island algorithm with softmax-based cluster sampling, island…

INSTALLATION
npx skills add https://github.com/lingzhi227/agent-research-skills --skill symbolic-equation
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

Symbolic Equation Discovery

Discover interpretable scientific equations from data using LLM-guided evolutionary search.

Input

  • $0 — Dataset description, variable names, and physical context

References

  • LLM-SR patterns (prompts, evolution, sampling): ~/.claude/skills/symbolic-equation/references/llmsr-patterns.md

Workflow (from LLM-SR)

Step 1: Define Problem Specification

Create a specification with:

  • Input variables: Physical quantities with types (e.g., x: np.ndarray, v: np.ndarray)
  • Output variable: Target quantity to predict
  • Evaluation function: Fitness metric (typically negative MSE with parameter optimization)
  • Physical context: Domain knowledge to guide equation discovery

# Example specification
import numpy as np

@equation.evolve
def equation(x: np.ndarray, v: np.ndarray, params: np.ndarray) -> np.ndarray:
    """Describe the acceleration of a damped nonlinear oscillator."""
    return params[0] * x
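The evaluation function mentioned in the list above can be sketched as follows. This is a minimal illustration, not the skill's actual evaluator: the data keys (`"x"`, `"v"`, `"a"`), the cap of 10 parameters, and the synthetic oscillator data are all assumptions made for the example. It fits `params` with SciPy's BFGS and returns the negative MSE, so higher is better.

```python
import numpy as np
from scipy.optimize import minimize


def equation(x: np.ndarray, v: np.ndarray, params: np.ndarray) -> np.ndarray:
    """Candidate skeleton: linear restoring force plus linear damping."""
    return params[0] * x + params[1] * v


def evaluate(data: dict, max_params: int = 10) -> float:
    """Fit params with BFGS, then return negative MSE (higher is better)."""
    # Assumed data layout: inputs under "x" and "v", target under "a".
    x, v, target = data["x"], data["v"], data["a"]

    def loss(params: np.ndarray) -> float:
        return float(np.mean((equation(x, v, params) - target) ** 2))

    result = minimize(loss, x0=np.ones(max_params), method="BFGS")
    if not np.isfinite(result.fun):
        return float("-inf")  # treat numerically broken candidates as invalid
    return -float(result.fun)


# Synthetic data from a known linear system: a = -2.0 * x - 0.5 * v
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
v = rng.uniform(-1.0, 1.0, 200)
data = {"x": x, "v": v, "a": -2.0 * x - 0.5 * v}
score = evaluate(data)
```

Because the candidate skeleton matches the generating equation here, the fitted score should be very close to zero. A skeleton that misses a term would settle at a visibly lower (more negative) score.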

Step 2: Initialize Multi-Island Buffer

  • Create N islands (default: 10) for population diversity
  • Each island maintains independent clusters of equations
  • Clusters group equations by performance signature
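One possible in-memory layout for this buffer is sketched below. The class names (`Cluster`, `Island`, `make_buffer`), the use of a score tuple as the signature, and the seed score are hypothetical choices for illustration. The source only states that islands hold clusters and that clusters group equations by performance signature.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Equations that share one performance signature."""
    score: float
    programs: list[str] = field(default_factory=list)


@dataclass
class Island:
    """An independent sub-population, stored as signature -> cluster."""
    clusters: dict[tuple, Cluster] = field(default_factory=dict)

    def register(self, program: str, signature: tuple, score: float) -> None:
        # Equations with the same signature land in the same cluster.
        cluster = self.clusters.setdefault(signature, Cluster(score=score))
        cluster.programs.append(program)

    def best_score(self) -> float:
        return max((c.score for c in self.clusters.values()), default=float("-inf"))


def make_buffer(num_islands: int = 10,
                seed_program: str = "return params[0] * x",
                seed_score: float = -1.0) -> list[Island]:
    """Create the islands and seed each one with the same starting equation."""
    islands = [Island() for _ in range(num_islands)]
    for island in islands:
        island.register(seed_program, signature=(seed_score,), score=seed_score)
    return islands


islands = make_buffer()
islands[0].register("return params[0] * x + params[1] * v", (-0.25,), -0.25)
islands[0].register("return params[1] * v + params[0] * x", (-0.25,), -0.25)
```

The two equations registered at the end differ textually but score the same, so they share a cluster. This keeps near-duplicates from crowding out other candidates during sampling.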

Step 3: Evolutionary Search Loop

Repeat until convergence or max samples:

  • Select island: Random island selection
  • Build prompt: Sample top equations from clusters (softmax-weighted by score)
  • LLM proposes: Generate new equation as improved version
  • Evaluate: Execute on test data, compute fitness score
  • Register: Add to island's cluster if valid
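The loop above can be sketched in a few lines. This is a simplified illustration under stated assumptions: `propose` stands in for the LLM call, `evaluate` returns `None` for invalid candidates, islands are plain `program -> score` dictionaries, and the prompt uses the island's top two equations instead of softmax-weighted cluster sampling. The stub candidates and their scores are made up for the demo.

```python
from __future__ import annotations

import random
from typing import Callable, Optional


def search(propose: Callable[[list], str],
           evaluate: Callable[[str], Optional[float]],
           seed: str,
           num_islands: int = 10,
           max_samples: int = 100,
           rng: Optional[random.Random] = None) -> tuple:
    """Run the propose/evaluate/register loop and return (best_program, best_score)."""
    rng = rng or random.Random()
    seed_score = evaluate(seed)
    if seed_score is None:
        raise ValueError("seed equation must be valid")
    islands = [{seed: seed_score} for _ in range(num_islands)]

    for _ in range(max_samples):
        island = rng.choice(islands)             # 1. select a random island
        ranked = sorted(island, key=island.get)  # worst -> best
        examples = ranked[-2:]                   # 2. prompt examples (simplified)
        candidate = propose(examples)            # 3. LLM proposes (stubbed here)
        score = evaluate(candidate)              # 4. evaluate
        if score is not None:                    # 5. register only valid equations
            island[candidate] = score

    return max((kv for isl in islands for kv in isl.items()), key=lambda kv: kv[1])


# Stub "LLM" and evaluator with made-up candidates and scores.
CANDIDATES = {
    "params[0] * x": -1.0,
    "params[0] * x + params[1] * v": -0.2,
    "params[0] * x + params[1] * v + params[2] * x**3": -0.01,
}
stub_rng = random.Random(0)


def propose_stub(examples: list) -> str:
    return stub_rng.choice(list(CANDIDATES) + ["x +* v"])  # last one is invalid


def evaluate_stub(program: str) -> Optional[float]:
    return CANDIDATES.get(program)


best_program, best_score = search(propose_stub, evaluate_stub,
                                  seed="params[0] * x",
                                  max_samples=200, rng=random.Random(1))
```

In a real run, `propose` would send the versioned prompt to the model and `evaluate` would execute the returned code under a timeout.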

Step 4: Prompt Construction

Present previous equations as a versioned sequence, ending with an empty function for the LLM to complete:

def equation_v0(x, v, params):
    """Initial version."""
    return params[0] * x

def equation_v1(x, v, params):
    """Improved version of equation_v0."""
    return params[0] * x + params[1] * v

def equation_v2(x, v, params):
    """Improved version of equation_v1."""
    # LLM completes this
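A prompt in this shape can be assembled with plain string formatting. The helper below is a hypothetical sketch (`build_prompt` is not a name from LLM-SR). It assumes the sampled equation bodies arrive ordered from worst to best and share one argument list.

```python
import textwrap


def build_prompt(bodies: list, signature: str = "x, v, params") -> str:
    """Render bodies as equation_v0..v{n-1}, then an open equation_v{n} header."""
    if not bodies:
        raise ValueError("need at least one previous equation")
    parts = []
    for i, body in enumerate(bodies):
        doc = "Initial version." if i == 0 else f"Improved version of equation_v{i - 1}."
        indented = textwrap.indent(textwrap.dedent(body).strip(), "    ")
        parts.append(
            f"def equation_v{i}({signature}):\n"
            f'    """{doc}"""\n'
            f"{indented}\n"
        )
    n = len(bodies)
    # The final function has only a header and docstring; the LLM writes the body.
    parts.append(
        f"def equation_v{n}({signature}):\n"
        f'    """Improved version of equation_v{n - 1}."""\n'
    )
    return "\n".join(parts)


prompt = build_prompt([
    "return params[0] * x",
    "return params[0] * x + params[1] * v",
])
```

Ordering the examples from worst to best lets the final, empty function read as the natural next improvement in the sequence.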

Step 5: Island Reset (Diversity Maintenance)

Periodically (default: every 4 hours):

  • Sort islands by best score
  • Reset bottom 50% of islands
  • Seed each reset island with best equation from a surviving island
  • Restart cluster sampling temperature
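The reset step can be sketched as below, assuming each island is a plain `program -> score` dictionary. That layout and the function name `reset_islands` are illustrative choices, and the time-based trigger is left out. The founder for each reset island is the best equation of a randomly chosen surviving island.

```python
import random


def reset_islands(islands: list, rng: random.Random) -> list:
    """Reset the worse half of the islands in place; return the reset indices."""
    def best(island: dict) -> float:
        return max(island.values())

    order = sorted(range(len(islands)), key=lambda i: best(islands[i]))  # worst first
    num_reset = len(islands) // 2
    reset_ids, keep_ids = order[:num_reset], order[num_reset:]

    for i in reset_ids:
        donor = islands[rng.choice(keep_ids)]
        founder = max(donor, key=donor.get)      # best equation of a survivor
        islands[i] = {founder: donor[founder]}   # fresh island with one founder
    return reset_ids


islands = [
    {"a": -1.0},
    {"b": -0.5, "c": -0.1},
    {"d": -2.0},
    {"e": -0.3},
]
reset_ids = reset_islands(islands, random.Random(0))
```

A fuller implementation would also zero the reset island's program counter, which is what restarts its sampling temperature.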

Step 6: Extract Best Equations

After search completes:

  • Collect best equation from each island
  • Rank by fitness score
  • Simplify if possible (algebraic simplification)
  • Report with physical interpretation

Cluster Sampling

Temperature-scheduled softmax over cluster scores:

temperature = T_init * (1 - (num_programs % period) / period)
probabilities = softmax(cluster_scores / temperature)

  • Higher temperature → more exploration
  • Lower temperature → more exploitation of best clusters
  • Within clusters: shorter programs are preferred (Occam's razor)
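The schedule and both sampling stages can be written out with the standard library alone. In this sketch the defaults `t_init=0.1` and `period=30_000` are assumed values, not taken from this document. The within-cluster length preference is likewise one plausible reading of "shorter programs are preferred", implemented as a softmax over negated, normalized lengths.

```python
import math
import random


def cluster_temperature(num_programs: int, t_init: float = 0.1,
                        period: int = 30_000) -> float:
    """Sawtooth schedule: starts at t_init, decays within each period, then restarts."""
    return t_init * (1 - (num_programs % period) / period)


def softmax(scores: list, temperature: float) -> list:
    """Numerically stable softmax of scores / temperature."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_cluster(cluster_scores: list, num_programs: int,
                   rng: random.Random) -> int:
    """Pick a cluster index, weighted by score at the current temperature."""
    probs = softmax(cluster_scores, cluster_temperature(num_programs))
    return rng.choices(range(len(cluster_scores)), weights=probs, k=1)[0]


def sample_program(programs: list, rng: random.Random,
                   length_temperature: float = 1.0) -> str:
    """Within a cluster, favor shorter programs."""
    lengths = [len(p) for p in programs]
    shortest, longest = min(lengths), max(lengths)
    normalized = [(n - shortest) / (longest + 1e-6) for n in lengths]
    probs = softmax([-z for z in normalized], length_temperature)
    return rng.choices(programs, weights=probs, k=1)[0]


scores = [-1.0, -0.5, -0.1]
probs_early = softmax(scores, cluster_temperature(0))       # start of a period
probs_late = softmax(scores, cluster_temperature(29_999))   # end of a period
```

Since `num_programs % period` is always below `period`, the temperature stays strictly positive, so the division inside the softmax is safe. Late in a period nearly all probability mass sits on the best cluster.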

Rules

  • Equations must use only standard mathematical operations
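  • Parameter optimization via SciPy's BFGS (scipy.optimize.minimize) or an Adam optimizer (Adam is not part of SciPy; it comes from a framework such as PyTorch)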
  • Fitness = negative MSE (higher is better)
  • Timeout protection for equation evaluation
  • No recursive equations allowed
  • Physical interpretability is preferred over pure fit
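Two of these rules, the operation restriction and the recursion ban, can be checked statically before an equation is ever executed. The sketch below uses Python's `ast` module. The allowlist `ALLOWED_CALLS` is a hypothetical example of "standard mathematical operations", not a list defined by the skill, and the check does not replace timeout protection at run time.

```python
import ast

# Assumed allowlist of callable names; adjust to the problem domain.
ALLOWED_CALLS = {"sin", "cos", "tan", "exp", "log", "sqrt", "abs", "tanh", "power"}


def check_equation(source: str) -> list:
    """Return a list of rule violations found in one equation function."""
    tree = ast.parse(source)
    func = next((n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)), None)
    if func is None:
        return ["no function definition found"]

    problems = []
    for node in ast.walk(func):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("imports are not allowed inside an equation")
        elif isinstance(node, ast.Call):
            callee = node.func
            # np.sin(...) is an Attribute call; sin(...) is a plain Name call.
            if isinstance(callee, ast.Attribute):
                name = callee.attr
            else:
                name = getattr(callee, "id", None)
            if name == func.name:
                problems.append(f"recursive call to {func.name}")
            elif name not in ALLOWED_CALLS:
                problems.append(f"call to non-allowlisted function: {name}")
    return problems


ok = check_equation(
    "def equation(x, v, params):\n"
    "    return params[0] * np.sin(x) + params[1] * v\n"
)
bad = check_equation(
    "def equation(x, v, params):\n"
    "    return equation(x, v, params) + open('f')\n"
)
```

Rejecting a candidate at this stage is cheap, and it keeps obviously unsafe code such as file access away from the evaluator.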
