agentdb-learning-plugins

Create and train AI learning plugins with AgentDB's 9 reinforcement learning algorithms. Includes Decision Transformer, Q-Learning, SARSA, Actor-Critic, and…

INSTALLATION
npx skills add https://github.com/ruvnet/ruflo --skill agentdb-learning-plugins
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

AgentDB Learning Plugins

What This Skill Does

Provides access to 9 reinforcement learning algorithms via AgentDB's plugin system. Create, train, and deploy learning plugins for autonomous agents that improve through experience. Includes offline RL (Decision Transformer), value-based learning (Q-Learning), policy gradients (Actor-Critic), and advanced techniques.

Performance: Train models 10-100x faster with WASM-accelerated neural inference.

Prerequisites

  • Node.js 18+
  • AgentDB v1.0.7+ (via agentic-flow)
  • Basic understanding of reinforcement learning (recommended)

Quick Start with CLI

Create Learning Plugin

# Interactive wizard

npx agentdb@latest create-plugin

# Use specific template

npx agentdb@latest create-plugin -t decision-transformer -n my-agent

# Preview without creating

npx agentdb@latest create-plugin -t q-learning --dry-run

# Custom output directory

npx agentdb@latest create-plugin -t actor-critic -o ./plugins

List Available Templates

# Show all plugin templates

npx agentdb@latest list-templates

# Available templates:

# - decision-transformer (sequence modeling RL - recommended)

# - q-learning (value-based learning)

# - sarsa (on-policy TD learning)

# - actor-critic (policy gradient with baseline)

# - curiosity-driven (exploration-based)

Manage Plugins

# List installed plugins

npx agentdb@latest list-plugins

# Get plugin information

npx agentdb@latest plugin-info my-agent

# Shows: algorithm, configuration, training status

Quick Start with API

import { createAgentDBAdapter } from 'agentic-flow/reasoningbank';

// Initialize with learning enabled

const adapter = await createAgentDBAdapter({

  dbPath: '.agentdb/learning.db',

  enableLearning: true,       // Enable learning plugins

  enableReasoning: true,

  cacheSize: 1000,

});

// Store training experience

await adapter.insertPattern({

  id: '',

  type: 'experience',

  domain: 'game-playing',

  pattern_data: JSON.stringify({

    embedding: await computeEmbedding('state-action-reward'),

    pattern: {

      state: [0.1, 0.2, 0.3],

      action: 2,

      reward: 1.0,

      next_state: [0.15, 0.25, 0.35],

      done: false

    }

  }),

  confidence: 0.9,

  usage_count: 1,

  success_count: 1,

  created_at: Date.now(),

  last_used: Date.now(),

});

// Train learning model

const metrics = await adapter.train({

  epochs: 50,

  batchSize: 32,

});

console.log('Training Loss:', metrics.loss);

console.log('Duration:', metrics.duration, 'ms');
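
The snippets in this document call `computeEmbedding`, which is not defined here. The sketch below is a hypothetical stand-in, assuming only that a fixed-length numeric vector is required. It hashes character trigrams into buckets and L2-normalizes the result. It is not AgentDB's embedding and carries no semantic meaning, so replace it with a real embedding model before relying on retrieval quality. The names `embedText` and `dim` are illustrative.

```javascript
// Hypothetical stand-in for computeEmbedding; not part of AgentDB.
// Hashes character trigrams into `dim` buckets, then L2-normalizes the vector.
function embedText(text, dim = 64) {
  const vec = new Array(dim).fill(0);
  for (let i = 0; i + 3 <= text.length; i++) {
    const gram = text.slice(i, i + 3);
    // FNV-1a style 32-bit hash of the trigram
    let h = 2166136261;
    for (let j = 0; j < gram.length; j++) {
      h ^= gram.charCodeAt(j);
      h = Math.imul(h, 16777619) >>> 0;
    }
    vec[h % dim] += 1;
  }
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? vec : vec.map((v) => v / norm);
}

// Async wrapper so call sites can keep using `await computeEmbedding(...)`.
async function computeEmbedding(text) {
  return embedText(text);
}
```

The hash makes the output deterministic, so the same text always maps to the same vector.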

Available Learning Algorithms (9 Total)

1. Decision Transformer (Recommended)

Type: Offline Reinforcement Learning

Best For: Learning from logged experiences, imitation learning

Strengths: No online interaction needed, stable training

npx agentdb@latest create-plugin -t decision-transformer -n dt-agent

Use Cases:

  • Learn from historical data
  • Imitation learning from expert demonstrations
  • Safe learning without environment interaction
  • Sequence modeling tasks

Configuration:

{

  "algorithm": "decision-transformer",

  "model_size": "base",

  "context_length": 20,

  "embed_dim": 128,

  "n_heads": 8,

  "n_layers": 6

}
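
A Decision Transformer is usually trained on sequences of (return-to-go, state, action) tokens, with `context_length` limiting how many recent steps the model sees. The sketch below shows that data preparation step only, assuming each logged step has `state`, `action`, and `reward` fields as in the experience records above. The helper names `returnsToGo` and `buildContext` are hypothetical and are not part of the AgentDB API.

```javascript
// Return-to-go at step t is the sum of rewards from t to the end of the episode.
function returnsToGo(rewards) {
  const rtg = new Array(rewards.length);
  let running = 0;
  for (let t = rewards.length - 1; t >= 0; t--) {
    running += rewards[t];
    rtg[t] = running;
  }
  return rtg;
}

// Build (return-to-go, state, action) tokens and keep the last `contextLength` steps.
function buildContext(steps, contextLength = 20) {
  const rtg = returnsToGo(steps.map((s) => s.reward));
  const tokens = steps.map((s, t) => ({ rtg: rtg[t], state: s.state, action: s.action }));
  return tokens.slice(-contextLength);
}
```

At inference time the same structure is typically seeded with a target return, which is then reduced by each observed reward.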

2. Q-Learning

Type: Value-Based RL (Off-Policy)

Best For: Discrete action spaces, sample efficiency

Strengths: Proven, simple, works well for small/medium problems

npx agentdb@latest create-plugin -t q-learning -n q-agent

Use Cases:

  • Grid worlds, board games
  • Navigation tasks
  • Resource allocation
  • Discrete decision-making

Configuration:

{

  "algorithm": "q-learning",

  "learning_rate": 0.001,

  "gamma": 0.99,

  "epsilon": 0.1,

  "epsilon_decay": 0.995

}
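
To show how these configuration values are typically used, here is a minimal tabular Q-learning sketch. It assumes states have already been discretized to string keys and that the configuration keys match the JSON above. The function names (`getRow`, `chooseAction`, `qLearningUpdate`, `decayEpsilon`) are hypothetical and do not describe how the AgentDB plugin is implemented internally.

```javascript
// `q` is a Map from state key to an array of action values.
function getRow(q, state, nActions) {
  if (!q.has(state)) q.set(state, new Array(nActions).fill(0));
  return q.get(state);
}

// Epsilon-greedy: explore with probability epsilon, otherwise take the greedy action.
function chooseAction(q, state, nActions, epsilon, rand = Math.random) {
  if (rand() < epsilon) return Math.floor(rand() * nActions);
  const row = getRow(q, state, nActions);
  return row.indexOf(Math.max(...row));
}

// Off-policy update: bootstrap from the best action value in the next state.
function qLearningUpdate(q, t, cfg, nActions) {
  const row = getRow(q, t.state, nActions);
  const nextMax = t.done ? 0 : Math.max(...getRow(q, t.next_state, nActions));
  const target = t.reward + cfg.gamma * nextMax;
  row[t.action] += cfg.learning_rate * (target - row[t.action]);
}

// Shrink epsilon after each episode, with a floor so exploration never stops entirely.
function decayEpsilon(epsilon, decay, minEpsilon = 0.01) {
  return Math.max(minEpsilon, epsilon * decay);
}
```

With `epsilon: 0.1` and `epsilon_decay: 0.995`, calling `decayEpsilon` once per episode lowers exploration gradually rather than switching it off.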

3. SARSA

Type: Value-Based RL (On-Policy)

Best For: Safe exploration, risk-sensitive tasks

Strengths: More conservative than Q-Learning, better for safety

npx agentdb@latest create-plugin -t sarsa -n sarsa-agent

Use Cases:

  • Safety-critical applications
  • Risk-sensitive decision-making
  • Online learning with exploration

Configuration:

{

  "algorithm": "sarsa",

  "learning_rate": 0.001,

  "gamma": 0.99,

  "epsilon": 0.1

}
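
The on-policy nature of SARSA shows up in the update target: it uses the value of the action the policy actually takes next, not the maximum over actions. A minimal tabular sketch follows, assuming discretized state keys and a plain object `q` of action-value arrays. `sarsaUpdate` is a hypothetical helper, not an AgentDB function.

```javascript
// On-policy TD update: bootstrap from the next action chosen by the current policy.
function sarsaUpdate(q, t, nextAction, cfg) {
  const current = q[t.state][t.action];
  const nextValue = t.done ? 0 : q[t.next_state][nextAction];
  const target = t.reward + cfg.gamma * nextValue;
  q[t.state][t.action] = current + cfg.learning_rate * (target - current);
}
```

Because exploratory actions feed into the target, risky states that are only safe under a perfectly greedy policy end up with lower values, which is where the more conservative behavior comes from.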

4. Actor-Critic

Type: Policy Gradient with Value Baseline

Best For: Continuous actions, variance reduction

Strengths: Stable, works for continuous/discrete actions

npx agentdb@latest create-plugin -t actor-critic -n ac-agent

Use Cases:

  • Continuous control (robotics, simulations)
  • Complex action spaces
  • Multi-agent coordination

Configuration:

{

  "algorithm": "actor-critic",

  "actor_lr": 0.001,

  "critic_lr": 0.002,

  "gamma": 0.99,

  "entropy_coef": 0.01

}
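
The separate `actor_lr` and `critic_lr` values reflect two updates driven by the same TD error. The sketch below is a one-step, tabular actor-critic with a softmax policy over discrete actions, assuming discretized state keys. It leaves out the entropy bonus controlled by `entropy_coef` to stay short, and the names `softmax` and `actorCriticUpdate` are hypothetical rather than AgentDB internals.

```javascript
// Numerically stable softmax over action preferences.
function softmax(prefs) {
  const max = Math.max(...prefs);
  const exps = prefs.map((p) => Math.exp(p - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// One-step actor-critic update for a single transition.
// `actor[state]` holds action preferences, `critic[state]` holds V(state).
function actorCriticUpdate(actor, critic, t, cfg) {
  const nextV = t.done ? 0 : critic[t.next_state];
  const tdError = t.reward + cfg.gamma * nextV - critic[t.state];
  // Policy probabilities before the update.
  const probs = softmax(actor[t.state]);
  // Critic moves V(state) toward the TD target.
  critic[t.state] += cfg.critic_lr * tdError;
  // Actor follows grad log pi(a|s) = onehot(a) - pi(.|s), scaled by the TD error.
  for (let a = 0; a < probs.length; a++) {
    const grad = (a === t.action ? 1 : 0) - probs[a];
    actor[t.state][a] += cfg.actor_lr * tdError * grad;
  }
  return tdError;
}
```

The TD error acts as the baseline-corrected advantage, which is the variance reduction mentioned above.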

5. Active Learning

Type: Query-Based Learning

Best For: Label-efficient learning, human-in-the-loop

Strengths: Minimizes labeling cost, focuses on uncertain samples

Use Cases:

  • Human feedback incorporation
  • Label-efficient training
  • Uncertainty sampling
  • Annotation cost reduction
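
Uncertainty sampling can be sketched in a few lines: score each unlabeled sample by the entropy of the model's predicted class probabilities and send the highest-scoring ones for labeling. This assumes the model outputs one probability array per sample. `entropy` and `selectForLabeling` are hypothetical names, not AgentDB functions.

```javascript
// Shannon entropy (natural log) of a probability distribution.
function entropy(probs) {
  return -probs.reduce((sum, p) => (p > 0 ? sum + p * Math.log(p) : sum), 0);
}

// Return the indices of the `budget` most uncertain samples.
function selectForLabeling(predictions, budget) {
  return predictions
    .map((probs, index) => ({ index, score: entropy(probs) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, budget)
    .map((item) => item.index);
}
```

The `budget` parameter is the number of labels you are willing to pay for in one round.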

6. Adversarial Training

Type: Robustness Enhancement

Best For: Safety, robustness to perturbations

Strengths: Improves model robustness, adversarial defense

Use Cases:

  • Security applications
  • Robust decision-making
  • Adversarial defense
  • Safety testing
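
One common way to generate the perturbations used in adversarial training is the fast gradient sign method (FGSM). The sketch below applies it to a logistic-regression scorer, which is an assumption made purely for illustration. This document does not say which attack or model the AgentDB plugin uses, and `fgsmPerturb` is a hypothetical helper.

```javascript
function sigmoid(z) {
  return 1 / (1 + Math.exp(-z));
}

// FGSM for logistic regression with label y in {0, 1}.
// The cross-entropy gradient w.r.t. the input is (p - y) * weights,
// so each feature moves by epsilon in the direction that increases the loss.
function fgsmPerturb(x, y, weights, bias, epsilon) {
  const z = x.reduce((sum, xi, i) => sum + xi * weights[i], bias);
  const p = sigmoid(z);
  return x.map((xi, i) => xi + epsilon * Math.sign((p - y) * weights[i]));
}
```

Training on a mix of clean inputs and their perturbed copies is the usual way to turn this into a robustness technique.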

7. Curriculum Learning

Type: Progressive Difficulty Training

Best For: Complex tasks, faster convergence

Strengths: Stable learning, faster convergence on hard tasks

Use Cases:

  • Complex multi-stage tasks
  • Hard exploration problems
  • Skill composition
  • Transfer learning
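
Progressive difficulty can be sketched as sorting tasks by a difficulty score and training on them stage by stage. The example assumes each task carries a numeric `difficulty` field, which is a hypothetical convention, and `buildCurriculum` is an illustrative helper, not an AgentDB function.

```javascript
// Split tasks into `nStages` groups of increasing difficulty.
function buildCurriculum(tasks, nStages) {
  const sorted = [...tasks].sort((a, b) => a.difficulty - b.difficulty);
  const stageSize = Math.ceil(sorted.length / nStages);
  const stages = [];
  for (let i = 0; i < sorted.length; i += stageSize) {
    stages.push(sorted.slice(i, i + stageSize));
  }
  return stages;
}
```

A training loop would then iterate over the stages in order, moving on once performance on the current stage is good enough.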

8. Federated Learning

Type: Distributed Learning

Best For: Privacy, distributed data

Strengths: Privacy-preserving, scalable

Use Cases:

  • Multi-agent systems
  • Privacy-sensitive data
  • Distributed training
  • Collaborative learning
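
The aggregation step at the heart of federated learning is often federated averaging: each client trains locally and only its weights are combined, weighted by how much data it holds. The sketch assumes every client reports a flat `weights` array and a `numSamples` count. Both field names and `federatedAverage` are hypothetical, and this document does not state which aggregation rule AgentDB uses.

```javascript
// Weighted average of client weight vectors, weighted by local sample count.
function federatedAverage(clients) {
  const total = clients.reduce((sum, c) => sum + c.numSamples, 0);
  const dim = clients[0].weights.length;
  const globalWeights = new Array(dim).fill(0);
  for (const client of clients) {
    for (let i = 0; i < dim; i++) {
      globalWeights[i] += (client.numSamples / total) * client.weights[i];
    }
  }
  return globalWeights;
}
```

Raw experiences never leave the clients in this scheme, which is the source of the privacy benefit.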

9. Multi-Task Learning

Type: Transfer Learning

Best For: Related tasks, knowledge sharing

Strengths: Faster learning on new tasks, better generalization

Use Cases:

  • Task families
  • Transfer learning
  • Domain adaptation
  • Meta-learning

Training Workflow

1. Collect Experiences

// Store experiences during agent execution

for (let i = 0; i < numEpisodes; i++) {

  const episode = runEpisode();

  for (const step of episode.steps) {

    await adapter.insertPattern({

      id: '',

      type: 'experience',

      domain: 'task-domain',

      pattern_data: JSON.stringify({

        embedding: await computeEmbedding(JSON.stringify(step)),

        pattern: {

          state: step.state,

          action: step.action,

          reward: step.reward,

          next_state: step.next_state,

          done: step.done

        }

      }),

      confidence: step.reward > 0 ? 0.9 : 0.5,

      usage_count: 1,

      success_count: step.reward > 0 ? 1 : 0,

      created_at: Date.now(),

      last_used: Date.now(),

    });

  }

}

2. Train Model

// Train on collected experiences

const trainingMetrics = await adapter.train({

  epochs: 100,

  batchSize: 64,

  learningRate: 0.001,

  validationSplit: 0.2,

});

console.log('Training Metrics:', trainingMetrics);

// {

//   loss: 0.023,

//   valLoss: 0.028,

//   duration: 1523,

//   epochs: 100

// }

3. Evaluate Performance

// Retrieve similar successful experiences

const testQuery = await computeEmbedding(JSON.stringify(testState));

const result = await adapter.retrieveWithReasoning(testQuery, {

  domain: 'task-domain',

  k: 10,

  synthesizeContext: true,

});

// Evaluate action quality

const suggestedAction = result.memories[0].pattern.action;

const confidence = result.memories[0].similarity;

console.log('Suggested Action:', suggestedAction);

console.log('Confidence:', confidence);
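
Taking only the top memory ignores the other `k - 1` results. A slightly more robust option is a similarity-weighted vote across all retrieved memories. The sketch assumes each memory has the `pattern.action` and `similarity` fields used in the snippet above. `voteOnAction` is a hypothetical helper.

```javascript
// Sum similarity per action and return the action with the highest total.
// `confidence` is that action's share of the total similarity mass.
function voteOnAction(memories) {
  const scores = new Map();
  for (const m of memories) {
    const action = m.pattern.action;
    scores.set(action, (scores.get(action) || 0) + m.similarity);
  }
  let best = null;
  let bestScore = -Infinity;
  let total = 0;
  for (const [action, score] of scores) {
    total += score;
    if (score > bestScore) {
      best = action;
      bestScore = score;
    }
  }
  return { action: best, confidence: total > 0 ? bestScore / total : 0 };
}
```

It would be called as `voteOnAction(result.memories)` after retrieval.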

Advanced Training Techniques

Experience Replay

// Store experiences in buffer

const replayBuffer = [];

// Sample random batch for training

const batch = sampleRandomBatch(replayBuffer, 32);  // second argument is the batch size

// Train on batch

await adapter.train({

  data: batch,

  epochs: 1,

  batchSize: 32,

});
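
`sampleRandomBatch` is not defined in this document. A minimal version, assuming the buffer is a plain array and sampling is uniform without replacement, could look like the following. Both helpers are hypothetical. `pushExperience` adds the fixed capacity that replay buffers usually have.

```javascript
// Uniform sampling without replacement using a partial Fisher-Yates shuffle.
function sampleRandomBatch(buffer, batchSize, rand = Math.random) {
  const pool = buffer.slice();
  const n = Math.min(batchSize, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(rand() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Fixed-capacity buffer: drop the oldest experience when full.
function pushExperience(buffer, experience, capacity = 10000) {
  buffer.push(experience);
  if (buffer.length > capacity) buffer.shift();
}
```

Sampling from a copy leaves the buffer order untouched, so the oldest-first eviction in `pushExperience` keeps working.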

Prioritized Experience Replay

// Store experiences with priority (TD error)

await adapter.insertPattern({

  // ... standard fields

  confidence: tdError,  // Use TD error as confidence/priority

  // ...

});

// Retrieve high-priority experiences

const highPriority = await adapter.retrieveWithReasoning(queryEmbedding, {

  domain: 'task-domain',

  k: 32,

  minConfidence: 0.7,  // Only high TD-error experiences

});
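
Raw TD errors can be negative or larger than 1, so they usually need to be converted before they are used as a priority. The sketch below shows proportional prioritization, where the sampling probability is `(|tdError| + eps)^alpha` normalized over the buffer. The defaults for `alpha` and `eps` are illustrative assumptions, and both function names are hypothetical.

```javascript
// Convert TD errors into sampling probabilities.
// alpha = 0 gives uniform sampling; larger alpha favors high-error experiences.
function priorityProbabilities(tdErrors, alpha = 0.6, eps = 1e-3) {
  const scaled = tdErrors.map((e) => Math.pow(Math.abs(e) + eps, alpha));
  const total = scaled.reduce((a, b) => a + b, 0);
  return scaled.map((s) => s / total);
}

// Draw one index according to the given probabilities.
function sampleIndex(probs, rand = Math.random) {
  const r = rand();
  let cumulative = 0;
  for (let i = 0; i < probs.length; i++) {
    cumulative += probs[i];
    if (r < cumulative) return i;
  }
  return probs.length - 1; // guard against floating-point rounding
}
```

The small `eps` keeps experiences with zero TD error from never being replayed.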

Multi-Agent Training

// Collect experiences from multiple agents

for (const agent of agents) {

  const experience = await agent.step();

  await adapter.insertPattern({

    // ... store experience with agent ID

    domain: `multi-agent/${agent.id}`,

  });

}

// Train shared model

await adapter.train({

  epochs: 50,

  batchSize: 64,

});

Performance Optimization

Batch Training

// Collect batch of experiences

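const experiences = collectBatch(1000);  // argument is the batch size

// Insert the batch (batched inserts are reported as up to 500x faster; shown here as one call per experience)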

for (const exp of experiences) {

  await adapter.insertPattern({ /* ... */ });

}

// Train on batch

await adapter.train({

  epochs: 10,

  batchSize: 128,  // Larger batch for efficiency

});

Incremental Learning

// Train incrementally as new data arrives

setInterval(async () => {

  const newExperiences = getNewExperiences();

  if (newExperiences.length > 100) {

    await adapter.train({

      epochs: 5,

      batchSize: 32,

    });

  }

}, 60000);  // Every minute
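
If a training run takes longer than the interval, the timer above can start a second run while the first is still in progress. One way to prevent that, sketched here with a hypothetical `createSingleFlight` wrapper, is to skip a tick whenever the previous run has not finished.

```javascript
// Wrap an async task so that overlapping invocations are skipped.
// Resolves to true if the task ran, false if the call was skipped.
function createSingleFlight(task) {
  let running = false;
  return async function run() {
    if (running) return false; // previous run still in progress: skip this tick
    running = true;
    try {
      await task();
      return true;
    } finally {
      running = false;
    }
  };
}
```

The wrapped function can be passed to `setInterval` in place of the inline callback, with the training logic moved into `task`.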

Integration with Reasoning Agents

Combine learning with reasoning for better performance:

// Train learning model

await adapter.train({ epochs: 50, batchSize: 32 });

// Use reasoning agents for inference

const result = await adapter.retrieveWithReasoning(queryEmbedding, {

  domain: 'decision-making',

  k: 10,

  useMMR: true,              // Diverse experiences

  synthesizeContext: true,    // Rich context

  optimizeMemory: true,       // Consolidate patterns

});

// Make decision based on learned experiences + reasoning

const decision = result.context.suggestedAction;

const confidence = result.memories[0].similarity;

CLI Operations

# Create plugin

npx agentdb@latest create-plugin -t decision-transformer -n my-plugin

# List plugins

npx agentdb@latest list-plugins

# Get plugin info

npx agentdb@latest plugin-info my-plugin

# List templates

npx agentdb@latest list-templates

Troubleshooting

Issue: Training not converging

// Reduce learning rate

await adapter.train({

  epochs: 100,

  batchSize: 32,

  learningRate: 0.0001,  // Lower learning rate

});

Issue: Overfitting

// Use validation split

await adapter.train({

  epochs: 50,

  batchSize: 64,

  validationSplit: 0.2,  // 20% validation

});

// Enable memory optimization

await adapter.retrieveWithReasoning(queryEmbedding, {

  optimizeMemory: true,  // Consolidate, reduce overfitting

});
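
A validation split only helps if something acts on the validation loss. The sketch below is a simple early-stopping check, assuming you call `adapter.train` a few epochs at a time and record the `valLoss` field from each returned metrics object. `shouldStopEarly`, `patience`, and `minDelta` are hypothetical names.

```javascript
// Stop when none of the last `patience` validation losses improves on the
// best earlier value by more than `minDelta`.
function shouldStopEarly(valLosses, patience = 3, minDelta = 0) {
  if (valLosses.length <= patience) return false;
  const bestBefore = Math.min(...valLosses.slice(0, -patience));
  const recent = valLosses.slice(-patience);
  return recent.every((v) => v > bestBefore - minDelta);
}
```

A training loss that keeps falling while this check fires is the usual sign of overfitting.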

Issue: Slow training

# Enable quantization for faster inference

# Use binary quantization (32x faster)

Learn More

  • Algorithm Papers: See docs/algorithms/ for detailed papers
  • GitHub: https://github.com/ruvnet/agentic-flow/tree/main/packages/agentdb
  • MCP Integration: npx agentdb@latest mcp
  • Website: https://agentdb.ruv.io

Category: Machine Learning / Reinforcement Learning

Difficulty: Intermediate to Advanced

Estimated Time: 30-60 minutes
