What's New
A quick-scan guide to what has landed in each major release. Start here when returning after time away — each bullet links to the relevant documentation.
v0.11.x — Production tooling + full observability (May 2026)
The focus: developer tooling that makes agents production-observable and repeatable, plus the first create-reactive-agent scaffolder, cross-runtime support, and three new capabilities (code-action strategy, skill persistence, interactive playground).
New packages
- @reactive-agents/observe — Zero-config OpenTelemetry tracing. Set OTEL_EXPORTER_OTLP_ENDPOINT and every run emits a workflow → LLM → tool span hierarchy, OpenInference-compliant, to any OTLP backend (Jaeger, Grafana Tempo, Langfuse, Arize Phoenix). See OpenTelemetry Tracing.
- @reactive-agents/replay — Deterministic trace replay. Record any run to a snapshot file and re-run it with a different model or prompt without calling the LLM again. Enables regression testing and prompt A/B comparisons. See Snapshot & Replay.
- @reactive-agents/runtime-shim — Cross-runtime support. The framework now runs on Node.js 22.5+ in addition to Bun. Provides unified Database, spawn, serve, glob, writeFile, readFile, and hash primitives that delegate to the available runtime. FTS5 is optional; the shim falls back to LIKE-based search on Node’s built-in SQLite. Unblocks Stackblitz WebContainers (Node-only) and Vercel/Netlify deployments.
New tooling
- create-reactive-agent CLI — bunx create-reactive-agent my-app scaffolds a runnable agent project in seconds. Supports --template minimal|standard|tool-use|multi-agent|gateway, --provider, --model, --pm bun|npm|yarn|pnpm. See create-reactive-agent.
Interactive Playground
Three live Stackblitz scenarios, zero install. Runs fully in-browser via WebContainers — no local runtime required. Default provider is Google Gemini (free tier).
| Scenario | What it shows |
|---|---|
| Hello Agent | Simple Q&A — minimal builder, one-step response |
| Tool Integration | Built-in code-execute + scratchpad tools working together |
| Strategy Demo | reactive vs plan-execute-reflect side-by-side on the same task |
See Playground.
code-action strategy (@experimental)
A 7th reasoning strategy in which the LLM generates a TypeScript IIFE that runs inside a Worker-thread sandbox. Tools are exposed as normal async functions and called via postMessage round-trips — no JSON schema juggling in the prompt. Best suited for multi-tool orchestration tasks where expressing control flow in code is cleaner than iterative tool calls.
Enable with defaultStrategy: "code-action". ToolService is optional; the strategy also handles pure computation tasks. See code-action.
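The postMessage round-trip described above can be sketched without a real Worker. In this minimal sketch the sandbox side turns a tool name into an ordinary async function that posts a request and waits for the reply carrying the same id. The message shapes, `makeHost`, and `makeToolProxy` are hypothetical names chosen for illustration, not the framework's API, and an in-process callback stands in for the Worker boundary.

```typescript
// Hypothetical message shapes; not the framework's actual protocol.
type ToolRequest = { id: number; tool: string; args: unknown };
type ToolReply = { id: number; ok: boolean; value: unknown };
type ToolImpl = (args: unknown) => unknown;

// Host side: receives a request and answers through `reply`.
// A real host would call postMessage on the Worker instead.
function makeHost(tools: Record<string, ToolImpl>) {
  return (req: ToolRequest, reply: (r: ToolReply) => void): void => {
    const impl = tools[req.tool];
    if (!impl) {
      reply({ id: req.id, ok: false, value: `unknown tool: ${req.tool}` });
      return;
    }
    try {
      reply({ id: req.id, ok: true, value: impl(req.args) });
    } catch (err) {
      reply({ id: req.id, ok: false, value: String(err) });
    }
  };
}

// Sandbox side: expose each tool as a plain async function.
function makeToolProxy(host: ReturnType<typeof makeHost>) {
  let nextId = 1;
  const pending = new Map<
    number,
    { resolve: (v: unknown) => void; reject: (e: Error) => void }
  >();

  // Match each reply to the promise that is waiting on its id.
  const onReply = (r: ToolReply): void => {
    const waiter = pending.get(r.id);
    if (!waiter) return;
    pending.delete(r.id);
    if (r.ok) waiter.resolve(r.value);
    else waiter.reject(new Error(String(r.value)));
  };

  return (tool: string) =>
    (args: unknown): Promise<unknown> =>
      new Promise((resolve, reject) => {
        const id = nextId++;
        pending.set(id, { resolve, reject });
        host({ id, tool, args }, onReply); // stands in for postMessage
      });
}
```

Generated code would then call something like `await callTool("search")(query)` and never see the message plumbing, which is the property the strategy relies on.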
Skill persistence
Learned SkillRecord objects now survive process restarts. The skill system uses a dual-store: the existing in-memory session store for fast within-run access, plus a new SQLite-backed SkillStore that persists across runs. On cold start, skills are resolved from the persistent store before any LLM call. skillFragmentToSkillRecord() is exported from reactive-agents for manual skill construction.
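The dual-store read path can be illustrated with a small sketch: writes go to both stores, and reads try the session store first, then fall back to the persistent store on a cold start. The `SkillRecord` fields, `PersistentStore`, `FakePersistentStore`, and `DualSkillStore` are assumptions made for this example; the real SkillStore is SQLite-backed, whereas a Map stands in for it here.

```typescript
// Hypothetical shape; the real SkillRecord likely carries more fields.
interface SkillRecord {
  name: string;
  instructions: string;
}

interface PersistentStore {
  load(name: string): SkillRecord | undefined;
  save(skill: SkillRecord): void;
}

// Map-backed fake standing in for the SQLite-backed store.
class FakePersistentStore implements PersistentStore {
  private readonly rows = new Map<string, SkillRecord>();
  load(name: string): SkillRecord | undefined {
    return this.rows.get(name);
  }
  save(skill: SkillRecord): void {
    this.rows.set(skill.name, skill);
  }
}

class DualSkillStore {
  private readonly session = new Map<string, SkillRecord>();
  private readonly persistent: PersistentStore;

  constructor(persistent: PersistentStore) {
    this.persistent = persistent;
  }

  // Write-through: a fast in-memory copy plus a durable copy.
  learn(skill: SkillRecord): void {
    this.session.set(skill.name, skill);
    this.persistent.save(skill);
  }

  // Session first; on a miss (the cold-start case) read the persistent
  // store and warm the session cache.
  resolve(name: string): SkillRecord | undefined {
    const hit = this.session.get(name);
    if (hit) return hit;
    const stored = this.persistent.load(name);
    if (stored) this.session.set(name, stored);
    return stored;
  }
}
```

Constructing a second `DualSkillStore` over the same persistent store simulates a process restart: the session map starts empty, yet previously learned skills still resolve.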
New runtime controls
Section titled “New runtime controls”-
RunHandle—runStream()now returns aRunHandlewith four controls and a status property:.pause()— suspends the loop at the next safe checkpoint.resume()— resumes a paused run.stop()— graceful shutdown: finishes the current step, then runs output synthesis.terminate()— immediate abort, skips synthesis.status—"running" | "paused" | "stopped" | "terminated" | "completed".result—Promisethat resolves when the run reaches a terminal state
See Compose API.
-
Killswitches — Six factory functions from
@reactive-agents/composethat wire stopping conditions into the agent loop. Pass them to.compose()or.withHarness():import { maxIterations, budgetLimit, timeoutAfter, watchdog, requireApprovalFor } from "@reactive-agents/compose";Factory Stops when… maxIterations(n)Loop count reaches nbudgetLimit({ maxTokens?, maxCostUSD? })Token or cost ceiling hit timeoutAfter(duration)Wall-clock duration exceeded watchdog({ timeout })No progress within timeoutrequireApprovalFor(toolName, approver)Named tool needs human approval See Compose API.
-
Compose API (
@stable) —.compose(fn)(alias:.withHarness(fn)) attaches a harness transform that intercepts tagged chokepoints (prompt.system,nudge.loop-detected,message.tool-result, etc.) viah.on(),h.tap(),h.before(),h.after(), andh.onError(). Existing builder methods.withSystemPrompt(),.withErrorHandler(), and.withHook()now desugar through the harness. See Compose API and Harness Tags.
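One way to picture how killswitch factories compose is as predicates over loop state, where the loop stops as soon as any predicate fires. The sketch below reuses three factory names from the table as local stand-ins, but the signatures, the `LoopState` type, and `runLoop` are assumptions for illustration only and do not reflect the actual @reactive-agents/compose implementation.

```typescript
// Hypothetical loop state; a real harness would track much more.
interface LoopState {
  iteration: number;
  tokensUsed: number;
  elapsedMs: number;
}

// A killswitch returns a reason string when it wants the loop to stop.
type Killswitch = (s: LoopState) => string | undefined;

const maxIterations = (n: number): Killswitch => (s) =>
  s.iteration >= n ? `maxIterations(${n})` : undefined;

const budgetLimit = (opts: { maxTokens: number }): Killswitch => (s) =>
  s.tokensUsed >= opts.maxTokens
    ? `budgetLimit(${opts.maxTokens} tokens)`
    : undefined;

const timeoutAfter = (ms: number): Killswitch => (s) =>
  s.elapsedMs >= ms ? `timeoutAfter(${ms}ms)` : undefined;

// Simulated loop: every step costs a fixed number of tokens and milliseconds.
// The guard bounds the simulation so it ends even with no killswitches.
function runLoop(
  switches: Killswitch[],
  tokensPerStep: number,
  msPerStep: number
): { stoppedBy: string; iterations: number } {
  const state: LoopState = { iteration: 0, tokensUsed: 0, elapsedMs: 0 };
  for (let guard = 0; guard < 10000; guard++) {
    for (const ks of switches) {
      const reason = ks(state);
      if (reason) return { stoppedBy: reason, iterations: state.iteration };
    }
    state.iteration += 1;
    state.tokensUsed += tokensPerStep;
    state.elapsedMs += msPerStep;
  }
  return { stoppedBy: "safety cap", iterations: state.iteration };
}
```

Because each switch is independent, the first condition to trip decides the stop reason, so combining a loose iteration cap with a tight token budget stops on the budget.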
Strategy switching on by default
enableStrategySwitching now defaults to true. The reactive intelligence dispatcher will switch strategies automatically when entropy signals a stuck loop — no explicit opt-in required.
Decision tracing
Agents can capture the model’s stated why for every tool call. On the reactive/adaptive paths, tool-call rationale is opt-in: it is an audit feature, not a performance feature, and it adds token and latency cost.
- auditRationale opt-in — .withReasoning({ auditRationale: true }) (or env RA_RATIONALE_AUDIT=1). When on, the kernel coaxes one <rationale call="N">{"why":"…","confidence":0-1}</rationale> block per tool call. Off by default.
- Native function-calling capture — parseRationaleBlocks() reads side-channel blocks from thought + thinking content and attaches each rationale to the matching ToolCallSpec by position. The parser tolerates fenced/prose-wrapped JSON, over-length why, and repeated call="N" attributes, so capture is reliable on small local models.
- plan-execute-reflect enforcement (always on) — LLMPlanStepSchema carries a rationale: { why, confidence? } field, mandatory for every tool_call step (independent of auditRationale). Failures after retry emit a plan_rationale_missing metric; no synthetic fallback is invented.
- AgentDebrief.rationale[] — Unified milestone-decision log: tool selections, curator decisions, strategy switches, reactive interventions, and terminations. All render in debrief.markdown under ## Decision Rationale.
See Decision Tracing for the full pipeline and Debrief & Chat for the result shape.
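To make the side-channel format concrete, here is a minimal sketch of a tolerant parser for `<rationale call="N">` blocks. It is not the framework's parseRationaleBlocks(); the function name `parseRationaleSketch`, the `MAX_WHY` cap, and the keep-the-first rule for repeated `call` values are all assumptions chosen to mirror the tolerances listed above.

```typescript
interface Rationale {
  call: number;
  why: string;
  confidence?: number;
}

const MAX_WHY = 200; // assumed cap applied to over-length `why` values

function parseRationaleSketch(text: string): Rationale[] {
  const out: Rationale[] = [];
  const seen = new Set<number>();
  const re = /<rationale\s+call="(\d+)">([\s\S]*?)<\/rationale>/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    const call = Number(m[1]);
    if (seen.has(call)) continue; // repeated call="N": keep the first valid one

    // Tolerate fenced or prose-wrapped JSON by slicing the outermost {...}.
    const body = m[2] ?? "";
    const start = body.indexOf("{");
    const end = body.lastIndexOf("}");
    if (start < 0 || end <= start) continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(body.slice(start, end + 1));
    } catch (_e) {
      continue; // malformed JSON: skip rather than invent a rationale
    }
    if (typeof parsed !== "object" || parsed === null) continue;

    const rec = parsed as { why?: unknown; confidence?: unknown };
    if (typeof rec.why !== "string") continue;

    const rationale: Rationale = { call, why: rec.why.slice(0, MAX_WHY) };
    if (typeof rec.confidence === "number") rationale.confidence = rec.confidence;
    seen.add(call);
    out.push(rationale);
  }
  return out;
}
```

Attaching each result to a tool call by position would then be a matter of matching `call` against the index of the tool call in the same model turn.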
Context-window override (numCtx)
Pin the exact context window the provider receives instead of relying on the model’s assumed maximum:
- .withModel({ model, numCtx }) — numCtx maps to Ollama’s num_ctx; cloud providers without a context-window knob ignore it. Now a first-class AgentConfig field, so it round-trips through toConfig() / fromJSON() and the config-driven path. See Builder API and Local Models.
- Cortex Studio — exposed as a Context length (numCtx) field in the Lab Builder’s Inference section, and used as the authoritative denominator for the context-usage gauge.
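The effect of using numCtx as the gauge denominator is easy to show with arithmetic. The helper below is a hypothetical sketch, not Cortex code: `contextUsage` and its parameters are invented for illustration, and the fallback to the model's assumed maximum when numCtx is unset is an assumption.

```typescript
// Hypothetical helper: fraction of the context window currently in use.
// `numCtx` is the pinned window; `assumedMax` is the model's advertised maximum.
function contextUsage(
  tokensInPrompt: number,
  assumedMax: number,
  numCtx?: number
): number {
  const denominator = numCtx ?? assumedMax; // a pinned numCtx takes precedence
  if (denominator <= 0) throw new RangeError("context window must be positive");
  return Math.min(1, tokensInPrompt / denominator);
}
```

With a 2,048-token prompt, a model advertising 131,072 tokens would read as under 2% full, while the same prompt against a pinned numCtx of 8,192 reads as 25% full, which is the figure that matters if the provider really receives the smaller window.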
Cortex rich-trace debugger
The Cortex Run View’s Trace Panel adds a Timeline view: a fine-grained, filterable, chronological event stream (LLM exchanges with prompt-cache %, tool calls, strategy switches, verifier verdicts, guards) grouped by iteration, reusing the same TraceEvent model as rax diagnose. The classic per-iteration Frames view remains a click away. See Cortex.
v0.10.x — Local models match frontier (May 2026)
The biggest release since v0.9 — 0.10.0 through 0.10.6, shipped over four weeks. The headline: local Ollama models now hit 91–94% on the same task suite as paid frontier APIs, thanks to a closed-loop healing pipeline and adaptive tool-calling. Read the full v0.10.0 changelog for engineering detail.
What you gain
Local models that actually work
- Healing Pipeline — 4-stage closed-loop recovery on every tool call (tool-name fuzzy match → parameter-name aliasing → path resolution → type coercion). 86.7% recovery rate, +80pp accuracy, 90% cheaper than LLM reprompt. Ships on by default — see LLM Providers and Resilience.
- Adaptive tool calling — Each model gets fingerprinted on first run; models capable of native function calling (FC) route through the JSON path, weaker ones through a 3-tier text-parse cascade (XML → JSON → pseudo-code). The framework learns each model’s dialect after 5 runs and stops asking it to do things it can’t.
- Calibration system — Per-model observations (parallel-call capability, classifier reliability, tool-call dialect) adapt empirically. Auto-enabled when
.withReasoning()is on. - Frontier benchmark: 100% on
ra-fullverified acrossclaude-sonnet-4-6,claude-haiku-4-5,gpt-4o-mini,gemini-2.5-pro. Bare LLM only reaches 85% on the same suite. - Local benchmark: 91–94% on
ra-fullforgemma4:e4b(4 GB) andcogito:14b(9 GB) — tied withgemini-2.5-flashandgpt-4o-minion the same 35-task suite.
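The four healing stages named in the first bullet above can be sketched as one pass over a malformed tool call. This is an illustrative reconstruction under stated assumptions, not the shipped pipeline: `ToolSpec`, `ToolCall`, `heal`, the edit-distance threshold of 2, and the POSIX-style path handling are all hypothetical.

```typescript
interface ToolSpec {
  name: string;
  params: Record<string, "string" | "number" | "boolean">;
  aliases?: Record<string, string>; // alias -> canonical parameter name
  pathParams?: string[]; // parameters that hold file paths
}
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Levenshtein edit distance (single-row variant) for fuzzy name matching.
function editDistance(a: string, b: string): number {
  const row: number[] = [];
  for (let j = 0; j <= b.length; j++) row.push(j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + cost);
      diag = above;
    }
  }
  return row[b.length];
}

function heal(call: ToolCall, specs: ToolSpec[], cwd: string): ToolCall | undefined {
  // Stage 1: fuzzy tool-name match (closest name within 2 edits).
  let spec: ToolSpec | undefined;
  let best = 3;
  for (const s of specs) {
    const d = editDistance(call.tool.toLowerCase(), s.name.toLowerCase());
    if (d < best) {
      best = d;
      spec = s;
    }
  }
  if (!spec) return undefined;

  const args: Record<string, unknown> = {};
  for (const key of Object.keys(call.args)) {
    // Stage 2: parameter-name aliasing.
    const canonical = spec.aliases?.[key] ?? key;
    const expected = spec.params[canonical];
    if (!expected) continue; // drop parameters the tool does not declare
    let value = call.args[key];

    // Stage 3: resolve relative paths against the working directory.
    const isPath = spec.pathParams?.includes(canonical) ?? false;
    if (isPath && typeof value === "string" && !value.startsWith("/")) {
      value = cwd.replace(/\/$/, "") + "/" + value.replace(/^\.\//, "");
    }

    // Stage 4: coerce strings into the declared primitive type.
    if (expected === "number" && typeof value === "string") {
      const n = Number(value);
      if (value.trim() !== "" && !Number.isNaN(n)) value = n;
    }
    if (expected === "boolean" && (value === "true" || value === "false")) {
      value = value === "true";
    }
    args[canonical] = value;
  }
  return { tool: spec.name, args };
}
```

Each stage is a cheap deterministic repair, which is consistent with the claim that healing costs far less than re-prompting the model.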
Long agent runs stay cheap
- Three-stage context curation — Tool results get compressed and stashed → curator renders only what’s needed → optional reactive trim. 60.7% context reduction, 38.6% token savings, 0.16 ms overhead per step. See Intelligent Context Synthesis.
- Reactive Intelligence dispatcher — 6 corrective interventions fire automatically when an agent shows entropy signs (early-stop, temperature adjust, strategy switch, context compress, tool inject, skill activate). Suppression gates prevent runaway dispatch. See Reactive Intelligence.
Production safety hardened
- @reactive-agents/diagnose — Standalone npm package detects system-prompt, API-key, credential, and internal-instruction leaks in any output. 100% true positive, 0% false positive, 0.02 ms latency. 25 regex patterns + 4 FP filters.
- Single-owner termination — All 12 phases route stop decisions through one arbitrator. CI lint guard prevents future bypass paths. Agents always finish cleanly, never get stuck.
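The "regex patterns plus false-positive filters" design of the leak detector can be sketched in a few lines. The patterns and filters below are illustrative stand-ins, not the package's actual 25 patterns or 4 filters, and `detectLeaks` is a hypothetical name.

```typescript
interface LeakFinding {
  kind: string;
  match: string;
}

// Illustrative patterns only; not the package's actual rule set.
const PATTERNS: { kind: string; re: RegExp }[] = [
  { kind: "api-key", re: /\bsk-[A-Za-z0-9]{20,}\b/g },
  { kind: "aws-access-key", re: /\bAKIA[0-9A-Z]{16}\b/g },
  { kind: "private-key", re: /-----BEGIN [A-Z ]*PRIVATE KEY-----/g },
];

// False-positive filters: reject matches that are obviously placeholders.
const FP_FILTERS: ((match: string) => boolean)[] = [
  (m) => /^sk-x+$/i.test(m), // sk-xxxx... documentation placeholder
  (m) => /EXAMPLE/.test(m), // sample keys that label themselves as examples
];

function detectLeaks(output: string): LeakFinding[] {
  const findings: LeakFinding[] = [];
  for (const { kind, re } of PATTERNS) {
    re.lastIndex = 0; // global regexes keep state between calls
    let m: RegExpExecArray | null;
    while ((m = re.exec(output)) !== null) {
      const match = m[0];
      if (FP_FILTERS.some((isFalsePositive) => isFalsePositive(match))) continue;
      findings.push({ kind, match });
    }
  }
  return findings;
}
```

Running the filters only on regex hits keeps the common case (clean output, no matches) down to a handful of linear scans.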
Better runtime + tooling
- @reactive-agents/cortex — Cortex Studio is now installable from npm: bunx @reactive-agents/cortex or rax cortex launches the live agent canvas, debrief UI, and visual builder. See Cortex.
- Gateway chat mode — Per-sender SQLite session history, episodic context injection, daily compaction. Set channels.mode: 'chat' for conversational webhooks; keep 'task' for one-shot triggers. See Gateway and Messaging Channels.
- Composable kernel architecture — Internal kernel/ reorganized by capability (act/ · attend/ · comprehend/ · decide/ · reason/ · reflect/ · sense/ · verify/ + loop/ + state/). Doesn’t change the public API; makes contributing to the framework easier. See Composable Kernel.
- 5,294 tests across 741 files — verified by bun test on every PR.
Patch releases
| Version | Highlights |
|---|---|
| 0.10.0 | Phase 1 release — healing pipeline, calibration, diagnose, cortex npm |
| 0.10.1–0.10.2 | Documentation polish, version drift fixes across 28 packages |
| 0.10.3 | Coordinated package alignment, npm publish drift CI guard |
| 0.10.4 | Coordinated changeset release (single source of truth) |
| 0.10.5–0.10.6 | Static-asset serving in Cortex server, README + cookbook freshness |
Breaking changes
None. All existing ReactiveAgents.create().with*() builder chains keep working unchanged. New calibration fields are forward-compatible — existing ~/.reactive-agents/observations/ files decode cleanly.
v0.9.x — MCP Production Hardening + Pre-v0.10 Polish
- MCP client rewritten on @modelcontextprotocol/sdk — smart auto-detection between stdio and HTTP-only containers, two-phase Docker lifecycle — see Orchestration
- Composable kernel architecture (initial) — react-kernel.ts reduced from ~1,700 to ~197 lines via makeKernel({ phases }) factory — see Composable Kernel
- Permanently-failed required tools fix — tools that always error no longer cause loop-until-maxIterations — see Harness Control Flow
- Cortex MCP CRUD + JSON import — import Cursor/Claude-style MCP configs directly into Cortex — see Cortex
- StatusRenderer TUI — live terminal display with collapsible think panel (t key toggles), mode: 'stream' | 'status'
- 3 new terminal tools — git-cli, gh-cli, and gws-cli are now built-in
- Web-search provider Serper.dev — third web-search backend alongside Tavily
- crypto-price built-in tool — CoinGecko price lookup, no API key required
- Observability on by default — minimal verbosity is now enabled out of the box
- Sub-agent maxIterations fully honored — the silent cap of 3 has been removed
v0.9.0 — MCP Production Hardening
- MCP client rewritten on @modelcontextprotocol/sdk — smart auto-detection between stdio and HTTP-only containers, two-phase Docker lifecycle — see Orchestration
- Composable kernel architecture — react-kernel.ts reduced from ~1,700 to ~197 lines via makeKernel({ phases }) factory; phases are now individually swappable — see Composable Kernel
- Permanently-failed required tools fix — tools that always error no longer cause loop-until-maxIterations; framework detects and stops early — see Harness Control Flow
- Cortex MCP CRUD + JSON import — import Cursor/Claude-style MCP configs directly into Cortex — see Cortex
- effect moved to peerDependencies — add effect explicitly if you import from it directly — see Installation
v0.8.5 — Native FC Hardening + Web Framework Adapters
- React, Vue, and Svelte adapters — useAgentStream() and useAgent() hooks/composables/stores for all three frameworks, consuming SSE endpoints — see Web Integration and Streaming
- 7-hook provider adapter system — taskFraming, toolGuidance, errorRecovery, synthesisPrompt, qualityCheck, continuationHint, systemPromptPatch fully wired — see Reactive Intelligence
- Dynamic stopping (3-layer) — novelty signal (Jaccard overlap), budget exhaustion phase transition, and per-tool call cap (maxCallsPerTool) — see Harness Control Flow
- Full prompt observability — logModelIO: true logs the complete FC conversation thread with no truncation — see Observability
- Actionable failure messages — loop detection, required-tools, and stall detection all emit Fix: suggestions with specific builder options — see Troubleshooting
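The novelty signal in the dynamic-stopping bullet rests on Jaccard overlap, which for two token sets A and B is |A ∩ B| / |A ∪ B|. The sketch below shows the measure and one possible stop rule. The tokenisation, the `isStale` helper, and the 0.8 threshold are assumptions for illustration; the framework's actual signal may be computed differently.

```typescript
// Jaccard overlap between the token sets of two texts: |A ∩ B| / |A ∪ B|.
function jaccard(a: string, b: string): number {
  const tokens = (s: string): Set<string> =>
    new Set(
      s
        .toLowerCase()
        .split(/\W+/)
        .filter((t) => t.length > 0)
    );
  const setA = tokens(a);
  const setB = tokens(b);
  if (setA.size === 0 && setB.size === 0) return 1; // two empty outputs count as identical
  let shared = 0;
  setA.forEach((t) => {
    if (setB.has(t)) shared += 1;
  });
  return shared / (setA.size + setB.size - shared);
}

// Hypothetical stop rule: the latest step adds too little that is new.
function isStale(previous: string, current: string, threshold = 0.8): boolean {
  return jaccard(previous, current) >= threshold;
}
```

A step whose output shares most of its tokens with the previous step scores near 1, which is the cue that further iterations are unlikely to add information.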
v0.8.0 — Reactive Intelligence Layer
- Entropy-aware intelligence pipeline — 5-source composite entropy sensor, trajectory classifier, and reactive controller that takes corrective action automatically — see Reactive Intelligence
- Thompson Sampling strategy learner — SQLite-backed bandit learns which reasoning strategy wins per task category across runs — see Reactive Intelligence
- Builder hardening —
withStrictValidation(),withTimeout(),withRetryPolicy(),withFallbacks(),withHealthCheck(), andwithErrorHandler()— see Builder API - Automatic strategy switching — when entropy analysis detects a stuck loop, the agent switches reasoning strategy without user intervention — see Choosing Strategies
- Observability dashboard upgrade — chalk/boxen terminal UI with entropy grade (A–F), sparklines, and entropy-informed alerts — see Observability
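For readers unfamiliar with Thompson Sampling, the idea behind the strategy learner can be sketched as a Beta-Bernoulli bandit: keep win/loss counts per strategy, draw one sample per strategy from Beta(wins + 1, losses + 1), and run the strategy with the largest draw. The sketch below is in-memory and covers a single task category; `StrategyBandit` and its methods are hypothetical names, and the real learner is SQLite-backed and keyed by task category.

```typescript
interface ArmStats {
  wins: number;
  losses: number;
}

// Beta(a, b) for positive integers a, b: the a-th smallest of
// (a + b - 1) independent uniform draws has exactly this distribution.
function sampleBeta(a: number, b: number, rng: () => number): number {
  const draws: number[] = [];
  for (let i = 0; i < a + b - 1; i++) draws.push(rng());
  draws.sort((x, y) => x - y);
  return draws[a - 1];
}

class StrategyBandit {
  private readonly arms = new Map<string, ArmStats>();
  private readonly rng: () => number;

  constructor(strategies: string[], rng: () => number = Math.random) {
    this.rng = rng;
    strategies.forEach((s) => this.arms.set(s, { wins: 0, losses: 0 }));
  }

  // Draw one posterior sample per strategy and pick the largest.
  choose(): string {
    let best = "";
    let bestSample = -1;
    this.arms.forEach((stats, name) => {
      const sample = sampleBeta(stats.wins + 1, stats.losses + 1, this.rng);
      if (sample > bestSample) {
        bestSample = sample;
        best = name;
      }
    });
    return best;
  }

  // Update the chosen strategy's counts with the observed outcome.
  record(strategy: string, success: boolean): void {
    const stats = this.arms.get(strategy);
    if (!stats) throw new Error(`unknown strategy: ${strategy}`);
    if (success) stats.wins += 1;
    else stats.losses += 1;
  }

  pulls(strategy: string): number {
    const stats = this.arms.get(strategy);
    return stats ? stats.wins + stats.losses : 0;
  }
}
```

Sampling from the posterior, rather than always picking the best observed win rate, is what lets the learner keep occasionally trying a strategy that had an unlucky start.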
v0.5.0 — A2A Protocol + Observability Foundation
- Full A2A (Agent-to-Agent) protocol — JSON-RPC 2.0 server, streaming SSE, client, discovery, and capability matching based on Google’s A2A spec — see A2A Protocol
- Agent-as-tool pattern — wrap any local or remote A2A agent as a callable tool with createAgentTool() / createRemoteAgentTool() — see Sub-agents
- Live observability streaming — withObservability({ live: true, verbosity }) writes structured phase logs to stdout as each step fires — see Observability
- rax serve — expose any agent as an A2A-compliant HTTP server with a single CLI command — see CLI
- EventBus reasoning events — all 5 strategies publish ReasoningStepCompleted; subscribe with agent.on() for custom monitoring — see Observability
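Since the A2A server speaks JSON-RPC 2.0, a short sketch of the envelope may help when reading its traffic. The request and response shapes and the error codes (-32700, -32600, -32601, -32603) come from the JSON-RPC 2.0 specification; the `dispatch` function and the `demo/echo` method are hypothetical and are not part of the A2A method set or the framework's server.

```typescript
// JSON-RPC 2.0 envelopes as defined by the JSON-RPC 2.0 specification.
type RpcId = number | string | null;
interface JsonRpcSuccess {
  jsonrpc: "2.0";
  id: RpcId;
  result: unknown;
}
interface JsonRpcFailure {
  jsonrpc: "2.0";
  id: RpcId;
  error: { code: number; message: string };
}
type JsonRpcResponse = JsonRpcSuccess | JsonRpcFailure;
type Handler = (params: unknown) => unknown;

const fail = (id: RpcId, code: number, message: string): JsonRpcFailure => ({
  jsonrpc: "2.0",
  id,
  error: { code, message },
});

// Minimal dispatcher: route one request to a handler by method name.
function dispatch(raw: string, handlers: Record<string, Handler>): JsonRpcResponse {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (_e) {
    return fail(null, -32700, "Parse error");
  }
  if (typeof parsed !== "object" || parsed === null) {
    return fail(null, -32600, "Invalid Request");
  }
  const req = parsed as { jsonrpc?: unknown; id?: unknown; method?: unknown; params?: unknown };
  const id: RpcId = typeof req.id === "number" || typeof req.id === "string" ? req.id : null;
  if (req.jsonrpc !== "2.0" || typeof req.method !== "string") {
    return fail(id, -32600, "Invalid Request");
  }
  const handler = handlers[req.method];
  if (!handler) return fail(id, -32601, "Method not found");
  try {
    return { jsonrpc: "2.0", id, result: handler(req.params) };
  } catch (_e) {
    return fail(id, -32603, "Internal error");
  }
}
```

A response always echoes the request's id, which is how a client matches replies to in-flight calls when several are outstanding.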