metaclaw-evolving-agent

Deploy and configure MetaClaw — an agent that meta-learns and evolves from live conversations using skills injection, RL training, and smart scheduling.

INSTALLATION
npx skills add https://github.com/aradotso/trending-skills --skill metaclaw-evolving-agent
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

    # Google Calendar scheduler for madmax mode
    pip install -e ".[scheduler]"

    # Recommended: everything
    pip install -e ".[rl,evolve,scheduler]"

---

## Quick Start

    # One-time interactive config wizard
    metaclaw setup

    # Start in default madmax mode (skills + RL + smart scheduler)
    metaclaw start

    # Skills only: no GPU, no Tinker needed
    metaclaw start --mode skills_only

    # RL mode without the scheduler: trains as soon as a batch is full
    metaclaw start --mode rl


After `metaclaw start`, a local OpenAI-compatible proxy is running. Point your client (OpenClaw or any OpenAI SDK consumer) at `http://localhost:<port>` instead of the upstream LLM endpoint.
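
As a small illustration of pointing a client at the proxy, the sketch below derives a client-side base URL from the `proxy.host` and `proxy.port` config values. The helper name `proxy_base_url` is hypothetical, and the `/v1` path suffix is an assumption taken from the OpenAI SDK example in this document, not confirmed MetaClaw behavior.

```python
def proxy_base_url(host: str = "0.0.0.0", port: int = 8080, path: str = "/v1") -> str:
    """Build the URL a local client could use to reach the proxy.

    A wildcard bind address such as 0.0.0.0 is not a portable destination
    for clients, so it is mapped to localhost. The "/v1" path is an assumption.
    """
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    client_host = "localhost" if host in ("0.0.0.0", "::", "") else host
    return f"http://{client_host}:{port}{path}"


print(proxy_base_url())  # http://localhost:8080/v1
```

Keeping the bind address and the client address separate avoids handing `0.0.0.0` to an SDK that expects a routable host.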

## Configuration

`metaclaw setup` writes a config file (default: `~/.metaclaw/config.yaml`). You can also edit it directly:

    # ~/.metaclaw/config.yaml
    proxy:
      host: 0.0.0.0
      port: 8080

    llm:
      provider: kimi                # kimi | qwen | claude | minimax | openai | gemini
      base_url: https://api.moonshot.cn/v1
      model: moonshot-v1-8k
      # api_key loaded from env: METACLAW_LLM_API_KEY

    skills:
      enabled: true
      max_injected: 5               # max skills injected per turn
      summarize_after_session: true

    rl:
      enabled: true
      backend: auto                 # auto | tinker | mint
      batch_size: 32
      algorithm: grpo
      opd_teacher: false            # optional teacher distillation

    scheduler:                      # madmax mode only
      enabled: true
      sleep_hours: [22, 7]          # local 22:00–07:00
      idle_timeout_minutes: 15
      google_calendar: false        # set true + configure OAuth for meeting detection

    logging:
      level: info
      log_dir: ~/.metaclaw/logs
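
When editing the file by hand, a quick sanity check can catch typos before starting the proxy. The sketch below works on an already-parsed config mapping and only checks values that the comments above enumerate. The function name `config_problems` is hypothetical, and MetaClaw may validate its config differently (or not at all).

```python
ALLOWED_PROVIDERS = {"kimi", "qwen", "claude", "minimax", "openai", "gemini"}
ALLOWED_BACKENDS = {"auto", "tinker", "mint"}


def config_problems(cfg: dict) -> list[str]:
    """Return human-readable problems found in a parsed config mapping."""
    problems = []

    port = cfg.get("proxy", {}).get("port")
    if not isinstance(port, int) or not 0 < port < 65536:
        problems.append(f"proxy.port must be an integer in 1..65535, got {port!r}")

    provider = cfg.get("llm", {}).get("provider")
    if provider not in ALLOWED_PROVIDERS:
        problems.append(f"llm.provider {provider!r} not in {sorted(ALLOWED_PROVIDERS)}")

    backend = cfg.get("rl", {}).get("backend", "auto")
    if backend not in ALLOWED_BACKENDS:
        problems.append(f"rl.backend {backend!r} not in {sorted(ALLOWED_BACKENDS)}")

    # sleep_hours is a [start_hour, end_hour] pair in local time.
    hours = cfg.get("scheduler", {}).get("sleep_hours", [22, 7])
    valid_hours = (
        isinstance(hours, (list, tuple))
        and len(hours) == 2
        and all(isinstance(h, int) and 0 <= h <= 23 for h in hours)
    )
    if not valid_hours:
        problems.append(f"scheduler.sleep_hours must be two hours in 0..23, got {hours!r}")

    return problems
```

An empty list means none of these checks failed; it does not prove the config is otherwise valid.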


### Environment Variables

    export METACLAW_LLM_API_KEY="your-llm-api-key"
    export METACLAW_TINKER_API_KEY="your-tinker-api-key"          # rl mode
    export METACLAW_MINT_API_KEY="your-mint-api-key"              # if backend=mint
    export GOOGLE_CALENDAR_CREDENTIALS_PATH="path/to/creds.json"  # scheduler
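
A pre-flight check for these variables might look like the sketch below. The mapping from mode to required variables is inferred from the comments above and is an assumption, as is the hypothetical helper name `missing_env_vars`.

```python
import os
from typing import Mapping, Optional

TINKER_KEY = "METACLAW_TINKER_API_KEY"
MINT_KEY = "METACLAW_MINT_API_KEY"


def missing_env_vars(
    mode: str,
    backend: str = "auto",
    google_calendar: bool = False,
    env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """List variables that look unset for the given mode (assumed mapping)."""
    env = os.environ if env is None else env
    missing = []
    if not env.get("METACLAW_LLM_API_KEY"):
        missing.append("METACLAW_LLM_API_KEY")
    if mode in ("rl", "madmax"):
        if backend == "auto":
            # With auto-detection, either backend key is enough.
            if not (env.get(TINKER_KEY) or env.get(MINT_KEY)):
                missing.append(f"{TINKER_KEY} or {MINT_KEY}")
        else:
            key = MINT_KEY if backend == "mint" else TINKER_KEY
            if not env.get(key):
                missing.append(key)
    if mode == "madmax" and google_calendar:
        if not env.get("GOOGLE_CALENDAR_CREDENTIALS_PATH"):
            missing.append("GOOGLE_CALENDAR_CREDENTIALS_PATH")
    return missing
```

Passing `env` explicitly makes the check easy to test without touching the real environment.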


## Operating Modes

| Mode | Command | GPU Required | Description |
|---|---|---|---|
| `skills_only` | `metaclaw start --mode skills_only` | No | Proxy + skills injection + auto-summarization |
| `rl` | `metaclaw start --mode rl` | Via API | Skills + GRPO training when batch fills |
| `madmax` | `metaclaw start` | Via API | Skills + RL + scheduler (trains only during idle/sleep/meetings) |

## Python API

### Programmatic startup

    import asyncio

    from metaclaw import MetaClawAgent, AgentConfig, Mode

    async def main():
        config = AgentConfig.from_yaml("~/.metaclaw/config.yaml")
        agent = MetaClawAgent(config, mode=Mode.MADMAX)
        await agent.start()

    asyncio.run(main())


### Manual skill injection

    from metaclaw.skills import SkillStore, SkillInjector

    store = SkillStore(path="~/.metaclaw/skills")

    # Add a skill manually
    store.add(
        name="code-review-checklist",
        content="Always check for: 1) error handling, 2) type hints, 3) docstrings.",
        tags=["code", "review"],
    )

    # Retrieve top-k relevant skills for a query
    injector = SkillInjector(store)
    relevant = injector.retrieve(query="review my Python function", top_k=3)

    for skill in relevant:
        print(skill.name, skill.score)
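
To make the idea of top-k retrieval concrete, here is a toy, dependency-free stand-in. It ranks skills by bag-of-words cosine similarity, which is a deliberately simple substitute for whatever vector search `SkillInjector.retrieve()` performs; the function names are hypothetical and nothing here reflects MetaClaw internals.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_top_k(skills: dict[str, str], query: str, top_k: int = 3) -> list[tuple[str, float]]:
    """Return (name, score) pairs for the top_k skills most similar to the query."""
    q = bag_of_words(query)
    scored = [(name, cosine(q, bag_of_words(content))) for name, content in skills.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


skills = {
    "code-review-checklist": "When you review code, check error handling, type hints, docstrings.",
    "travel-packing": "Book flights early and pack light.",
}
print(retrieve_top_k(skills, "review my Python code", top_k=1)[0][0])  # code-review-checklist
```

A real embedding model would also match paraphrases that share no tokens with the query, which this toy version cannot do.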


### Intercepting and recording conversations

    from metaclaw.proxy import ConversationInterceptor
    from metaclaw.memory import ExperienceBuffer

    buffer = ExperienceBuffer(max_size=1000)

    interceptor = ConversationInterceptor(
        upstream_url="https://api.moonshot.cn/v1",
        on_complete=buffer.record,  # called after each turn with (messages, response)
    )

    # buffer.record signature:
    async def on_complete(messages: list[dict], response: dict) -> None:
        ...
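
For readers who want to see what a callback with that signature could do, the sketch below implements a minimal bounded buffer on top of `collections.deque`. The class name `BoundedBuffer` and the drop-oldest eviction policy are assumptions for illustration; the real `ExperienceBuffer` may behave differently.

```python
import asyncio
from collections import deque


class BoundedBuffer:
    """Keep at most `max_size` turns, silently dropping the oldest (assumed policy)."""

    def __init__(self, max_size: int = 1000) -> None:
        self._turns: deque = deque(maxlen=max_size)

    async def record(self, messages: list[dict], response: dict) -> None:
        # Copy the message list so later mutation by the caller does not alter history.
        self._turns.append({"messages": list(messages), "response": response})

    def __len__(self) -> int:
        return len(self._turns)


async def demo() -> int:
    buf = BoundedBuffer(max_size=2)
    for i in range(3):
        await buf.record([{"role": "user", "content": f"turn {i}"}], {"id": i})
    return len(buf)


print(asyncio.run(demo()))  # 2
```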


### Triggering RL training manually

    from metaclaw.training import RLTrainer, TrainingConfig

    trainer = RLTrainer(
        config=TrainingConfig(
            backend="tinker",   # or "mint"
            algorithm="grpo",
            batch_size=32,
            lora_rank=16,
        )
    )

    # Collect a batch from the experience buffer and train
    async def run_training(buffer):
        batch = buffer.sample(n=32, split="support")  # support/query separation
        result = await trainer.train(batch)
        print(f"Training complete. Loss: {result.loss:.4f}, Steps: {result.steps}")
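
The "train when the batch fills" behavior of `rl` mode can be pictured as a small accumulator that fires a coroutine once enough experiences have arrived. This is a sketch only: `BatchTrigger` is a hypothetical name, and `fake_train` stands in for a real training call.

```python
import asyncio
from typing import Awaitable, Callable


class BatchTrigger:
    """Collect experiences and call `train_fn` each time a full batch is ready."""

    def __init__(self, batch_size: int, train_fn: Callable[[list], Awaitable[None]]) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.batch_size = batch_size
        self.train_fn = train_fn
        self._pending: list = []

    async def add(self, experience) -> bool:
        """Queue one experience; return True if this call triggered training."""
        self._pending.append(experience)
        if len(self._pending) < self.batch_size:
            return False
        batch, self._pending = self._pending, []
        await self.train_fn(batch)
        return True


async def demo() -> list[int]:
    sizes: list[int] = []

    async def fake_train(batch: list) -> None:  # stand-in for a real training call
        sizes.append(len(batch))

    trigger = BatchTrigger(batch_size=3, train_fn=fake_train)
    for turn in range(7):
        await trigger.add(turn)
    return sizes


print(asyncio.run(demo()))  # [3, 3]
```

Seven turns with a batch size of three produce two training calls and leave one experience pending for the next batch.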


### Reward modeling

    from metaclaw.rewards import RewardModel

    reward_model = RewardModel(provider="llm")  # uses configured LLM for scoring

    async def score_turn(prompt: str, response: str) -> float:
        score = await reward_model.score(prompt=prompt, response=response)
        return score  # float in [-1.0, 1.0]
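
Since the configured algorithm is GRPO, it may help to see how scalar rewards in `[-1.0, 1.0]` are commonly turned into a training signal. GRPO is generally described as normalizing each reward against the mean and standard deviation of its sampling group. The sketch below shows that normalization only; whether the Tinker or MinT backends compute it exactly this way is not stated in this document, and the function names are hypothetical.

```python
import statistics


def clamp_reward(score: float, low: float = -1.0, high: float = 1.0) -> float:
    """Force a score into the documented [-1.0, 1.0] range."""
    return max(low, min(high, score))


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    if not rewards:
        return []
    clamped = [clamp_reward(r) for r in rewards]
    mean = statistics.fmean(clamped)
    std = statistics.pstdev(clamped)  # population std; eps guards against zero
    return [(r - mean) / (std + eps) for r in clamped]


print(group_relative_advantages([0.5, 0.5, 0.5]))  # [0.0, 0.0, 0.0]
```

When every response in a group scores the same, all advantages are zero, so that group contributes no gradient signal.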


## Skills Lifecycle

    Conversation turn
        │
        ▼
    SkillInjector.retrieve()    ← vector search over SkillStore
        │  injects top-k skills into system prompt
        ▼
    LLM responds
        │
        ▼
    ExperienceBuffer.record()   ← stores (context, response, metadata)
        │
        ▼  (end of session)
    SkillSummarizer.run()       ← LLM extracts reusable patterns
        │
        ▼
    SkillStore.upsert()         ← new/updated skills persisted to disk
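
The injection step in this flow can be sketched as a pure function over an OpenAI-style message list. The prompt wording, the function name `inject_skills`, and the choice to append to an existing system message are all assumptions for illustration, not MetaClaw's actual prompt format.

```python
def inject_skills(messages: list[dict], skills: list[str], max_injected: int = 5) -> list[dict]:
    """Return a new message list with up to `max_injected` skills in the system prompt."""
    chosen = skills[:max_injected]
    if not chosen:
        return list(messages)
    block = "Relevant skills:\n" + "\n".join(f"- {s}" for s in chosen)
    if messages and messages[0].get("role") == "system":
        # Extend the existing system message instead of adding a second one.
        merged = {**messages[0], "content": f"{messages[0]['content']}\n\n{block}"}
        return [merged, *messages[1:]]
    return [{"role": "system", "content": block}, *messages]


original = [{"role": "user", "content": "Review my pull request strategy."}]
augmented = inject_skills(original, ["Check error handling.", "Check type hints."])
print(augmented[0]["content"])
```

Returning a new list rather than mutating the input keeps the client's original request intact, which matters if the unmodified turn is also what gets recorded.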


## Integration: OpenAI SDK as Client

Point any OpenAI SDK client at the MetaClaw proxy:

    from openai import OpenAI

    # MetaClaw proxy is running on localhost:8080
    client = OpenAI(
        base_url="http://localhost:8080/v1",
        api_key="not-used-but-required-by-sdk",
    )

    response = client.chat.completions.create(
        model="moonshot-v1-8k",  # passed through to upstream
        messages=[
            {"role": "user", "content": "Review my pull request strategy."}
        ],
    )

    print(response.choices[0].message.content)


Skills are injected transparently — the client code does not change.

## Scheduler (MadMax Mode)

The scheduler ensures RL weight updates never interrupt active use:

    from metaclaw.scheduler import MadMaxScheduler, SchedulerConfig

    scheduler = MadMaxScheduler(
        config=SchedulerConfig(
            sleep_hours=(22, 7),          # train between 22:00–07:00 local time
            idle_timeout_minutes=15,      # train after 15 min of no conversations
            google_calendar=True,         # also train during calendar meetings
            credentials_path="creds.json",
        )
    )

    # Check if it's safe to train right now.
    # `await` must run inside a coroutine; `trainer` and `batch` come from
    # the RL training example.
    async def train_if_idle(trainer, batch):
        if await scheduler.is_training_window():
            await trainer.train(batch)
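
The window check itself comes down to three conditions: inside sleep hours, idle long enough, or currently in a meeting. The sketch below shows one way to express that, including the wrap past midnight for a range like `(22, 7)`. The function names and the `in_meeting` flag are hypothetical; the real scheduler's logic is not documented here.

```python
from datetime import datetime, timedelta


def in_sleep_hours(now: datetime, sleep_hours: tuple[int, int] = (22, 7)) -> bool:
    """True if `now` falls in [start, end), allowing the range to wrap past midnight."""
    start, end = sleep_hours
    if start <= end:
        return start <= now.hour < end
    return now.hour >= start or now.hour < end


def is_training_window(
    now: datetime,
    last_activity: datetime,
    sleep_hours: tuple[int, int] = (22, 7),
    idle_timeout_minutes: int = 15,
    in_meeting: bool = False,
) -> bool:
    """Training is allowed when the user is asleep, idle, or in a meeting."""
    idle = now - last_activity >= timedelta(minutes=idle_timeout_minutes)
    return in_meeting or idle or in_sleep_hours(now, sleep_hours)


noon = datetime(2024, 1, 1, 12, 0)
print(is_training_window(noon, noon - timedelta(minutes=5)))   # False
print(is_training_window(noon, noon - timedelta(minutes=20)))  # True
```

Passing `now` in as an argument rather than calling `datetime.now()` inside keeps the logic easy to test against fixed timestamps.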


### Google Calendar Setup

    # 1. Enable Google Calendar API in Google Cloud Console
    # 2. Download OAuth2 credentials as creds.json
    # 3. Set path in config or env
    export GOOGLE_CALENDAR_CREDENTIALS_PATH="/path/to/creds.json"

    # 4. First run will open browser for OAuth consent
    metaclaw start


## Support/Query Set Separation

MetaClaw separates experience into support and query sets to prevent stale rewards from polluting updates:

    from metaclaw.memory import ExperienceBuffer

    buffer = ExperienceBuffer(
        max_size=2000,
        support_ratio=0.5,  # 50% support, 50% query
    )

    # During training (inside a coroutine; `trainer` is an RLTrainer instance):
    async def run_meta_training(trainer):
        support_batch = buffer.sample(n=16, split="support")  # used to compute reward signal
        query_batch = buffer.sample(n=16, split="query")      # used for gradient update
        await trainer.train_meta(support=support_batch, query=query_batch)
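
To illustrate what a `support_ratio` split means, the sketch below partitions a list of experiences into two disjoint sets. The random shuffle is an assumption made for simplicity; MetaClaw may instead split by time or by session, which would matter for the stale-reward concern described above. The function name is hypothetical.

```python
import random
from typing import Iterable, Optional


def split_support_query(
    experiences: Iterable,
    support_ratio: float = 0.5,
    seed: Optional[int] = None,
) -> tuple[list, list]:
    """Shuffle and split experiences into disjoint (support, query) lists."""
    if not 0.0 <= support_ratio <= 1.0:
        raise ValueError("support_ratio must be between 0 and 1")
    shuffled = list(experiences)
    random.Random(seed).shuffle(shuffled)  # local RNG, so global state is untouched
    cut = round(len(shuffled) * support_ratio)
    return shuffled[:cut], shuffled[cut:]


support, query = split_support_query(range(32), support_ratio=0.5, seed=0)
print(len(support), len(query))  # 16 16
```

The key property is that no experience appears in both sets, so the data used to compute the reward signal is never the same data used for the gradient update.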


## RL Backends

### Tinker (default)

    rl:
      backend: tinker
      tinker_project: my-metaclaw-project
      lora_rank: 16
      learning_rate: 1e-4


### MinT

Install the MinT compatibility layer separately:

    pip install metaclaw-mint

Then select the backend in the config:

    rl:
      backend: mint
      mint_endpoint: https://your-mint-endpoint


### Auto-detection

    rl:
      backend: auto  # tries tinker first, falls back to mint, errors if neither available
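
The fallback order described in that comment can be written out as a short resolver. How MetaClaw decides that a backend is "available" is not documented here, so the sketch takes the set of available backends as an input. `resolve_backend` is a hypothetical name, and the error text mirrors the message quoted under Troubleshooting.

```python
def resolve_backend(requested: str, available: set[str]) -> str:
    """Pick a training backend: tinker first, then mint, when `requested` is "auto"."""
    if requested == "auto":
        for candidate in ("tinker", "mint"):
            if candidate in available:
                return candidate
        raise RuntimeError("No training backend available")
    if requested not in ("tinker", "mint"):
        raise ValueError(f"Unknown backend: {requested!r}")
    if requested not in available:
        raise RuntimeError(f"Requested backend {requested!r} is not available")
    return requested


print(resolve_backend("auto", {"mint"}))  # mint
```

Setting the backend explicitly turns a silent fallback into a clear error, which is why the troubleshooting advice suggests trying `tinker` instead of `auto`.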


## Troubleshooting

**Proxy not reachable after `metaclaw start`**

- Check port conflicts: `lsof -i :8080`

- Change `proxy.port` in config and restart

**`rl` mode: "No training backend available"**

- Ensure `pip install -e ".[rl]"` completed successfully

- Verify `METACLAW_TINKER_API_KEY` or `METACLAW_MINT_API_KEY` is set

- Try `rl.backend: tinker` explicitly instead of `auto`

**Skills not persisting between sessions**

- Confirm `skills.summarize_after_session: true` in config

- Check write permissions on `~/.metaclaw/skills/`

- Run `metaclaw skills list` to inspect stored skills

**Madmax mode never trains**

- Verify `scheduler.sleep_hours` covers your timezone's night

- Lower `scheduler.idle_timeout_minutes` for testing (e.g., `1`)

- Check scheduler logs: `~/.metaclaw/logs/scheduler.log`

**Google Calendar integration fails**

- Re-run OAuth flow: delete `~/.metaclaw/token.json` and restart

- Ensure Calendar API is enabled in your Google Cloud project

**OPD teacher distillation errors**

- Only supported with `rl.backend: tinker`

- Requires a separate teacher model endpoint in config:

      rl:
        opd_teacher: true
        teacher_base_url: https://api.openai.com/v1
        teacher_model: gpt-4o


## CLI Reference

    metaclaw setup                        # interactive config wizard
    metaclaw start                        # start in madmax mode
    metaclaw start --mode skills_only
    metaclaw start --mode rl
    metaclaw start --config path/to/config.yaml

    metaclaw skills list                  # show all stored skills
    metaclaw skills delete <name>         # remove a skill
    metaclaw skills export skills.json

    metaclaw status                       # show proxy, scheduler, training status
    metaclaw logs                         # tail all logs
    metaclaw logs --component scheduler
