Quick Start¶
This guide shows the shortest path from installation to a posterior-weighted Skill context.
Bayesian-Agent supports three paths:
- start from scratch and evolve Skills during a full run
- repair only failed tasks from an existing agent run
- adapt the same Bayesian Skill registry to another harness through an adapter
Install¶
git clone https://github.com/DataArcTech/Bayesian-Agent.git
cd Bayesian-Agent
python -m pip install -e .
Bayesian-Agent v0.4 has no runtime dependencies beyond the Python standard library.
Update a Skill Registry¶
Use evolve to ingest one or more result files and update a persistent Bayesian Skill registry:
bayesian-agent evolve \
--results artifacts/ga_deepseek_baseline/sop_results.json \
--registry temp/bayesian_skill_beliefs.json \
--context-out temp/skill_context.md
This command:
- reads benchmark or agent run traces
- converts each run into
TrajectoryEvidence - updates the corresponding Skill posterior
- optionally renders reusable Skill context for a future run
Plan Incremental Repair¶
Use repair-plan to extract failed task ids from a baseline run:
bayesian-agent repair-plan \
--baseline artifacts/ga_deepseek_baseline/sop_results.json \
--out temp/failed_tasks.json
This is useful when Bayesian-Agent is attached to another agent as a repair layer.
Summarize Results¶
bayesian-agent summarize \
--results artifacts/bayesian_incremental/results.json \
--out temp/summary.json
For baseline plus repair traces:
bayesian-agent incremental-summary \
--baseline artifacts/ga_deepseek_baseline/sop_results.json \
--repairs artifacts/bayesian_incremental/results.json \
--out temp/incremental_summary.json
Python Example¶
from bayesian_agent import BayesianSkillRegistry, SkillContextBuilder, TrajectoryEvidence
registry = BayesianSkillRegistry("temp/beliefs.json")
registry.record(
TrajectoryEvidence(
task_id="sop_12",
skill_id="benchmark/sop_bench",
context="sop_bench",
outcome="failure",
failure_mode="xml_wrapped_answer",
input_tokens=70123,
output_tokens=4242,
)
)
print(SkillContextBuilder(registry).render(task_context="sop_bench"))
Expected Output Shape¶
Rendered context is intentionally short:
### Bayesian Skill Context
Use these posterior-weighted Skills/SOPs as hypotheses, not as unquestioned instructions.
- benchmark/sop_bench: posterior_success=0.333, alpha=1.0, beta=2.0, observations=1, mean_tokens=74365.0, rewrite=explore, failures=xml_wrapped_answer=1
Current task files and runtime feedback remain authoritative.