Bayesian Self-Evolving Agent Method¶
Bayesian-Agent treats each Skill or SOP as a hypothesis about agent success under a task context. The method is harness-agnostic: it can bootstrap Skills in a full run, repair existing agents incrementally, or transfer the same posterior Skill registry across compatible harnesses.
- theta: frozen base model parameters
- C: inference condition, including prompt, memory, tools, retrieved context, and harness feedback
- h: Skill/SOP hypothesis
The framework does not train the base model and does not require replacing the agent runtime. It changes the inference environment by maintaining posterior-weighted Skill context that can be injected through adapters.
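As a concrete illustration of injecting posterior-weighted Skill context, here is a minimal adapter-side sketch. The registry layout (`skill_id -> (alpha, beta, skill_text)`) and the function name are assumptions for illustration, not the framework's actual API.

```python
def build_context(base_prompt: str, registry: dict, top_k: int = 3) -> str:
    """Sketch: prepend the top-k Skills by posterior mean success rate.
    `registry` maps skill_id -> (alpha, beta, skill_text); layout is assumed."""
    ranked = sorted(
        registry.items(),
        key=lambda kv: kv[1][0] / (kv[1][0] + kv[1][1]),  # Beta posterior mean
        reverse=True,
    )
    skills = "\n".join(text for _, (_, _, text) in ranked[:top_k])
    return f"{skills}\n\n{base_prompt}"
```

Because the injection happens at the prompt layer, the same registry can be carried between harnesses without touching model weights.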
Evidence¶
Each agent run emits TrajectoryEvidence:
- task id
- skill id
- context
- success or failure outcome
- token counts
- latency and turns
- failure mode
- task metadata
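The fields above can be sketched as a record type; the field names and types here are illustrative, since the source lists the contents but not a schema:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TrajectoryEvidence:
    """One agent run's evidence record (field names are assumptions)."""
    task_id: str
    skill_id: str
    context: dict                      # inference condition C at run time
    success: bool                      # action-verified outcome
    tokens: int = 0
    latency_s: float = 0.0
    turns: int = 0
    failure_mode: Optional[str] = None
    task_metadata: dict = field(default_factory=dict)
```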
Evidence should be action-verified: a benchmark grader, unit test, or deterministic checker decides whether a run succeeded.
Belief Update¶
Each Skill uses a Beta posterior over its success rate, updated with each observed outcome.
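A minimal sketch of that update, assuming the standard conjugate rule Beta(alpha, beta) -> Beta(alpha + s, beta + f) after s successes and f failures; the class name and prior are illustrative:

```python
class SkillBelief:
    """Beta posterior over a Skill's success rate (standard conjugate update)."""

    def __init__(self, alpha: float = 1.0, beta: float = 1.0):
        self.alpha = alpha  # prior pseudo-count + observed successes
        self.beta = beta    # prior pseudo-count + observed failures

    def update(self, success: bool) -> None:
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

belief = SkillBelief()  # uniform Beta(1, 1) prior
for outcome in [True, True, False]:
    belief.update(outcome)
print(belief.mean)  # 3 / 5 = 0.6
```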
The registry also tracks cost, context distribution, and failure modes. These statistics guide what gets injected into future context.
Rewrite Policy¶
The default policy maps posterior state to actions:
- compress: repeated success suggests the Skill is stable
- patch: failures cluster around a recurring failure mode
- split: evidence spans different contexts
- retire: failures dominate the posterior
- explore: evidence is still sparse or uncertain
The policy is intentionally small in v0.4. It is designed to be replaced by project-specific policies.
Full Mode¶
Full self-evolving mode runs all tasks and updates Skill beliefs online. This mode tests whether Bayesian Skill Evolution can improve an agent from scratch.
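The online loop can be sketched as below, assuming Thompson sampling over Skill posteriors for selection; `beliefs` layout and the `run_task` callable are illustrative, not the framework's actual interface:

```python
import random

def run_full_mode(tasks, beliefs, run_task):
    """Online loop sketch. `beliefs` maps skill_id -> [alpha, beta];
    `run_task(task, skill_id)` returns an action-verified success bool."""
    for task in tasks:
        # Thompson sampling: draw from each Beta posterior, pick the best Skill
        skill_id = max(beliefs, key=lambda s: random.betavariate(*beliefs[s]))
        success = run_task(task, skill_id)
        beliefs[skill_id][0 if success else 1] += 1.0  # conjugate Beta update
```

Each run both exploits the current posterior and feeds the next update, which is what lets the agent improve from scratch.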
Incremental Repair Mode¶
Incremental repair mode starts from a baseline agent's traces rather than a fresh run, replaying logged outcomes to seed Skill posteriors before any live updates.
This mode is the recommended production path because it adds Bayesian-Agent as a plug-in repair layer instead of replacing the base agent.
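A sketch of bootstrapping posteriors from existing traces; the `(skill_id, success)` trace format and function name are assumptions:

```python
def seed_from_traces(traces):
    """Replay a baseline agent's logged runs to initialize Beta posteriors.
    `traces` is an assumed iterable of (skill_id, success) pairs."""
    beliefs = {}
    for skill_id, success in traces:
        a, b = beliefs.get(skill_id, (1.0, 1.0))  # Beta(1, 1) prior
        beliefs[skill_id] = (a + success, b + (not success))
    return beliefs
```

Because seeding only reads logs, the baseline agent keeps running unchanged while the repair layer accumulates evidence.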