PeiranLi0930/Plugin-GBT

GBT (Gated Behavior Tree) Plugin

GBT turns OpenClaw's throwaway execution logs into a reusable experience tree.

That changes the economics of agent work.

Without GBT, every similar task forces the agent to plan again, reason again, debug again, and spend expensive tokens again. With GBT, successful runs and failed runs are distilled into reusable operational memory. The next time a similar task appears, OpenClaw can stop wasting tokens on rediscovering the same path and instead execute against a learned tree of concrete experience.

And when a covered task fails, GBT does not just shrug and move on. It queues that failed trajectory, waits for idle time, asks for approval, silently replays the task inside OpenClaw, localizes the broken step, repairs it, verifies the repaired run against the real environment, and only then writes the repaired experience back into the tree.

This is not a prompt wrapper. It is a persistent experience system for OpenClaw.

Research Origin

This plugin is built on the core ideas from the paper:

Traversal-as-Policy: Log-Distilled Gated Behavior Trees as Externalized, Verifiable Policies for Safe, Robust, and Efficient Agents
Peiran Li, Jiashuo Sun, Fangzhou Lin, Shuo Xing, Tianfu Fu, Suofei Feng, Chaoqun Ni, Zhengzhong Tu
https://arxiv.org/abs/2603.05517

The plugin version in this repository focuses on turning agent logs into reusable experience trees, routing covered tasks onto cheaper single-step executors, and replaying failed trajectories to repair them into reusable experience. As noted below, the current release does not implement the paper's safety-gate mechanism.

What GBT Actually Does

  • Distills completed OpenClaw runs into reusable, non-task-specific macro nodes.
  • Stores both success and failure paths instead of discarding failure.
  • Switches covered tasks onto a cheaper executor model when existing experience is strong enough.
  • Injects step-local runtime guidance so the executor can act like a disciplined single-step worker instead of a full long-horizon planner.
  • Tracks runtime spine progress and recovery hints during execution.
  • Queues failed covered runs for idle-time self-evolution.
  • Replays failed tasks in fresh OpenClaw sessions, extracts the real transcript, verifies the repair, and only then writes the repaired path back into the tree.
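To make the "stores both success and failure paths" idea concrete, here is a minimal sketch of what one macro node might track. All names here (`MacroNode`, the outcome counters) are hypothetical; the plugin's real node schema is internal and richer than this.

```python
# Illustrative sketch only; GBT's actual node schema is internal to the plugin.
from dataclasses import dataclass, field

@dataclass
class MacroNode:
    """One reusable macro step distilled from OpenClaw tool logs."""
    summary: str                  # non-task-specific description of the step
    successes: int = 0            # runs where this step completed cleanly
    failures: int = 0             # runs where this step broke (kept, not discarded)
    recovery_hints: list = field(default_factory=list)

    def reliability(self) -> float:
        total = self.successes + self.failures
        return self.successes / total if total else 0.0

node = MacroNode(summary="run pytest and collect failing test ids")
node.successes += 3
node.failures += 1
print(round(node.reliability(), 2))  # 0.75
```

Keeping the failure count on the node is what lets later matching weigh "how often has this step actually worked" rather than only "have we seen this step before".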

Why This Is Useful

  • Lower token cost on repeated task families.
  • Less repeated planning on known workflows.
  • Better reuse of hard-won debugging work.
  • Stronger long-term personalization: each user grows their own tree from their own OpenClaw history.
  • Better use of expensive reasoning models: spend them on repair and evolution, not on re-solving the same problem forever.

Current Scope

This release fully implements the core GBT and GBT-SE (self-evolution) workflows: reusable experience distillation, covered-task guidance, and replay-backed self-evolution.

This release intentionally does not implement the paper's safety-gate system.

Prerequisites

Before installing GBT, make sure you have:

  1. OpenClaw installed and working.
  2. Node.js available in the same environment that runs OpenClaw.
  3. Python 3.10+ available as python3 or another executable you can point the plugin to.
  4. The Python openai package installed.
  5. At least one OpenClaw-capable model configured for your normal runs.
  6. OpenAI auth available if you want the bundled Python analysis path to do LLM-backed distillation / verification.

Install the Python dependency:

python3 -m pip install --upgrade openai

Step-by-Step Setup

1. Build or pack the plugin

From this repository:

npm install
npm run build

If you want a tarball for installation:

npm pack

That produces a file like gbt-skill-0.1.0.tgz.

2. Install the plugin into OpenClaw

Install from the local repository:

openclaw plugins install .

Or install from the packed tarball:

openclaw plugins install ./gbt-skill-0.1.0.tgz

After installing, restart OpenClaw if your setup requires it for plugin discovery.

3. Enable the plugin

Enable it through OpenClaw:

openclaw plugins enable gbt-skill

You can confirm it is visible with:

openclaw plugins list
openclaw plugins info gbt-skill

4. Configure the plugin

Add plugin config under plugins.entries.gbt-skill.config in your OpenClaw config.

Minimal example:

{
  "plugins": {
    "entries": {
      "gbt-skill": {
        "enabled": true,
        "config": {
          "pythonExecutable": "python3",
          "stateSubdir": "gbt-skill",
          "cheaperModel": "gpt-4.1-mini",
          "cheaperProvider": "openai",
          "coverageThreshold": 0.6,
          "idleMinutes": 10
        }
      }
    }
  }
}

What the main config values mean:

  • pythonExecutable: Python executable used to run the GBT engine.
  • stateSubdir: Where GBT stores its tree, episodes, and self-evolve state.
  • cheaperModel: Model used for covered tasks once GBT has reusable experience.
  • cheaperProvider: Optional provider override for that cheaper executor.
  • coverageThreshold: How confident GBT must be before switching into guided executor mode.
  • idleMinutes: How long OpenClaw must stay idle before GBT asks whether to start self-evolution.
  • distillModel: Optional analysis-model override. Leave it empty if you want GBT to inherit the model from the current OpenClaw run.
  • selfEvolveReplayCommand: Optional override. Leave it empty to use GBT's built-in OpenClaw replay runner.
  • selfEvolveReplayCwd: Optional working directory for a custom replay command.
  • selfEvolveReplayTimeoutSec: Timeout for replay verification.
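The interaction between coverageThreshold, cheaperModel, and the main model can be summarized with a small sketch. This is a hypothetical simplification of the routing decision, not the plugin's actual logic; the model names are the ones from the examples in this README.

```python
# Hypothetical sketch of the coverage decision; the real logic lives in the plugin.
def pick_executor(match_confidence: float,
                  coverage_threshold: float = 0.6,
                  cheaper_model: str = "gpt-4.1-mini",
                  main_model: str = "gpt-5.4") -> str:
    """Route covered tasks to the cheaper executor, everything else to the main model."""
    if match_confidence >= coverage_threshold:
        return cheaper_model   # covered: guided single-step execution
    return main_model          # uncovered: full planning on the strong model

print(pick_executor(0.82))  # gpt-4.1-mini
print(pick_executor(0.41))  # gpt-5.4
```

Raising coverageThreshold makes GBT more conservative: fewer tasks are treated as covered, at the cost of more runs on the expensive model.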

5. Decide whether you need an explicit analysis-model override

For most OpenAI-backed OpenClaw setups, you do not need to set distillModel.

GBT will inherit the main model from the run that just happened, and the bundled Python analysis engine will normalize OpenClaw model refs like openai/gpt-5.4 into the directly callable OpenAI model name.
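The normalization described above amounts to stripping the provider prefix from an OpenClaw-style model ref. A hypothetical illustration (the actual rules are implemented inside the bundled Python engine and may handle more cases):

```python
# Hypothetical illustration of the model-ref normalization described above.
def normalize_model_ref(ref: str) -> str:
    """Strip a provider prefix like 'openai/' so the OpenAI client can call the model directly."""
    provider, sep, model = ref.partition("/")
    return model if sep else ref

print(normalize_model_ref("openai/gpt-5.4"))  # gpt-5.4
print(normalize_model_ref("gpt-5.4"))         # gpt-5.4
```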

Set distillModel only if one of these is true:

  • you want GBT's internal distill / diagnose / replay-verification steps to use a different model than your main OpenClaw run
  • your main OpenClaw model is not directly callable by the bundled Python OpenAI client

Example override:

{
  "plugins": {
    "entries": {
      "gbt-skill": {
        "enabled": true,
        "config": {
          "distillModel": "gpt-5.4"
        }
      }
    }
  }
}

6. Make sure analysis auth is available when needed

When a directly usable analysis model is available, the built-in replay runner combines OpenClaw's embedded replay path with the bundled Python analysis client to perform strict verification.

If your OpenClaw auth store does not already have an OpenAI profile, set an API key in the shell before replay or configure OpenClaw auth for OpenAI.

At minimum:

export OPENAI_API_KEY="your-key"

GBT's built-in runner will register openai:default in the local OpenClaw auth store if needed during replay.

If you leave distillModel empty and your main OpenClaw runs are not OpenAI-backed, replay still works, but some verification / diagnosis steps may fall back to heuristic analysis unless you provide an explicit OpenAI analysis model.
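If you want to confirm ahead of time that the shell fallback key is visible, a quick preflight check looks like this. This is purely illustrative; GBT performs its own auth resolution via the OpenClaw auth store, and the function name here is made up for the example.

```python
# Optional preflight sketch: check the key the replay runner could fall back to.
import os

def analysis_auth_available() -> bool:
    """True if an OpenAI API key is exported in this shell environment."""
    return bool(os.environ.get("OPENAI_API_KEY"))

if not analysis_auth_available():
    print("No OPENAI_API_KEY set; replay may fall back to heuristic analysis.")
```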

7. Start using OpenClaw normally

You do not need a special launch mode.

Once the plugin is enabled:

  • completed runs are distilled into the tree
  • covered tasks can route onto the cheaper executor
  • failed covered runs are queued for self-evolution

GBT is meant to sit inside the normal OpenClaw loop, not beside it.

First Run: What To Expect

After normal task completion

GBT will:

  1. Normalize the tool log.
  2. Segment it into macro steps.
  3. Distill reusable summaries and metadata.
  4. Add those nodes and paths into the persistent experience tree.
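To give a feel for step 2, here is a toy segmentation pass that groups a normalized tool log into macro steps whenever the tool changes. The real distillation is LLM-backed and far more nuanced; this sketch and its log shape are hypothetical.

```python
# Hypothetical sketch of log segmentation; real distillation is LLM-backed.
def segment_log(tool_calls):
    """Group a normalized tool log into macro steps whenever the tool changes."""
    macros, current = [], []
    for call in tool_calls:
        if current and call["tool"] != current[-1]["tool"]:
            macros.append(current)
            current = []
        current.append(call)
    if current:
        macros.append(current)
    return macros

log = [{"tool": "edit"}, {"tool": "edit"}, {"tool": "bash"}, {"tool": "edit"}]
print(len(segment_log(log)))  # 3
```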

When a new task is already covered

GBT will:

  1. Match the task against the tree.
  2. Decide whether confidence is high enough.
  3. If covered, switch to your configured cheaper executor model.
  4. Inject macro-by-macro execution guidance into the prompt.
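Step 4's "macro-by-macro execution guidance" can be pictured as rendering one step-local instruction block at a time, so the cheaper executor never has to plan the whole horizon. The real prompt template ships in the plugin's skill assets; this rendering is a made-up approximation.

```python
# Hypothetical sketch of step-local guidance; the real template ships with the plugin.
def build_guidance(macros, current_index):
    """Render one step-local instruction block for the cheaper executor."""
    lines = [f"Step {current_index + 1} of {len(macros)}:"]
    lines.append(f"Do exactly this: {macros[current_index]}")
    lines.append("Do not plan beyond this step; report the result when done.")
    return "\n".join(lines)

print(build_guidance(["clone repo", "run tests", "patch parser"], 1))
```

Constraining the executor to one macro at a time is what makes a cheap model viable here: it acts as a disciplined single-step worker rather than a long-horizon planner.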

When a covered task fails

GBT will:

  1. Preserve the failed trajectory.
  2. Queue it for self-evolution.
  3. Wait until OpenClaw has been idle for the configured window.
  4. Ask you whether to start self-evolution.
  5. If approved, silently replay the task in a fresh OpenClaw session.
  6. Verify the repair using real transcript evidence before writing it back into the tree.
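Steps 2-4 above boil down to a queue gated by an idle timer. A minimal sketch, assuming a class and method names that are purely illustrative (the plugin's state format and scheduling are internal):

```python
# Hypothetical sketch of the self-evolve queue and idle gate described above.
import time

class SelfEvolveQueue:
    def __init__(self, idle_minutes: float = 10):
        self.idle_seconds = idle_minutes * 60
        self.jobs = []                      # preserved failed trajectories
        self.last_activity = time.time()

    def enqueue(self, trajectory):
        self.jobs.append(trajectory)

    def ready_for_approval(self, now=None) -> bool:
        """True once OpenClaw has been idle long enough to ask the user."""
        now = time.time() if now is None else now
        return bool(self.jobs) and (now - self.last_activity) >= self.idle_seconds

q = SelfEvolveQueue(idle_minutes=10)
q.enqueue({"task": "fix failing parser test", "status": "failed"})
print(q.ready_for_approval(now=q.last_activity + 11 * 60))  # True
```

The approval gate (step 4) matters because replay consumes real model tokens; nothing runs until you say yes.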

User Commands

GBT exposes these commands:

  • /gbt status
  • /gbt match <task>
  • /gbt evolve approve
  • /gbt evolve reject

Recommended Configuration Pattern

Use your normal strong model for the main OpenClaw run, and reserve the cheaper model only for covered execution.

Example:

  • main OpenClaw model: stronger reasoning model
  • distillModel: leave empty to inherit the main OpenClaw model, or set it explicitly only if you want a different analysis model
  • cheaperModel: inexpensive executor model

That gives you the intended split:

  • strong models for original execution and repair analysis
  • cheaper models for repeated covered work

Verifying That GBT Is Really Working

Check build and tests:

npm test
npm run build

Inspect plugin state:

/gbt status

Ask whether a task is covered:

/gbt match fix the failing parser test by editing the parser file and rerunning pytest

If GBT is active, you should start seeing:

  • increasing node and episode counts
  • covered-task matches with confidence scores
  • failed covered jobs entering the self-evolve queue

Packaging and Public Release Notes

This package is a real OpenClaw plugin:

  • package.json declares openclaw.extensions = ["./dist/index.js"]
  • openclaw.plugin.json defines plugin metadata and config schema
  • the bundled plugin prompt assets ship under skills/gbt
  • the Python engine ships inside the package under gbt_skill
  • the built-in self-evolve replay runner is included and enabled by default

Release Validation

The current release has been validated with:

  • pytest -q
  • npm test
  • npm run build
  • real OpenClaw embedded replay smoke runs using the built-in replay runner

Practical Limits

GBT is powerful, but the system is still constrained by the underlying runtime:

  • if OpenClaw itself cannot access the required tools or workspace, replay cannot fix that
  • if your model/provider auth is missing, self-evolve replay cannot run
  • very hard long-horizon repair cases may need multiple replay attempts

In One Sentence

GBT turns OpenClaw from a stateless executor that keeps relearning the same lessons into a system that remembers, reuses, repairs, and gets cheaper on the work it has already paid to understand.

Citation

If you find GBT useful in your work, please consider citing:

@misc{li2026traversalaspolicylogdistilledgatedbehavior,
  title={Traversal-as-Policy: Log-Distilled Gated Behavior Trees as Externalized, Verifiable Policies for Safe, Robust, and Efficient Agents},
  author={Peiran Li and Jiashuo Sun and Fangzhou Lin and Shuo Xing and Tianfu Fu and Suofei Feng and Chaoqun Ni and Zhengzhong Tu},
  year={2026},
  eprint={2603.05517},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2603.05517}
}
