bun add reactive-agents

import { ReactiveAgents } from 'reactive-agents'

const agent = await ReactiveAgents.create()
  .withProvider('anthropic')
  .withReasoning()       // ReAct loop: Think → Act → Observe
  .withTools()           // Built-in: web-search, file-read, code-execute
  .withObservability()
  .build()

const result = await agent.run('Find the top 3 TypeScript testing frameworks')
console.log(result.output)

One package. Composable layers. Enable exactly what you need; skip everything you don’t.
You can. An agent is a loop — LLM → tool → observe → repeat. Hand-rolling works fine for prototypes, and we’d never tell you otherwise.
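To make that loop concrete, here is a minimal hand-rolled sketch. Everything in it is hypothetical: `runBareLoop`, `ScriptedModel`, and the `add` tool are illustrative names, and a scripted stub stands in for a real LLM call. It is not Reactive Agents code.

```typescript
// A hand-rolled agent loop: model -> tool -> observe -> repeat.
// All names are illustrative; the "model" is a scripted stub, not a real LLM.

type ModelStep =
  | { kind: "tool"; name: string; args: number[] }
  | { kind: "final"; answer: string };

type BareTool = (args: number[]) => string;
type ScriptedModel = (transcript: string[]) => ModelStep;

function runBareLoop(
  model: ScriptedModel,
  tools: Record<string, BareTool>,
  task: string,
  maxIterations = 5,
): { answer: string | null; iterations: number; transcript: string[] } {
  const transcript: string[] = [`task: ${task}`];
  for (let i = 1; i <= maxIterations; i++) {
    const step = model(transcript); // think
    if (step.kind === "final") {
      return { answer: step.answer, iterations: i, transcript };
    }
    const tool = tools[step.name]; // act
    const observation = tool
      ? tool(step.args)
      : `error: unknown tool "${step.name}"`;
    transcript.push(`observation: ${observation}`); // observe
  }
  // Iteration cap reached without a final answer.
  return { answer: null, iterations: maxIterations, transcript };
}

// Scripted stand-in for an LLM: call `add` once, then answer with the observation.
const scriptedModel: ScriptedModel = (transcript) => {
  const last = transcript[transcript.length - 1] ?? "";
  return last.startsWith("observation: ")
    ? { kind: "final", answer: last.slice("observation: ".length) }
    : { kind: "tool", name: "add", args: [2, 3] };
};

const bareTools: Record<string, BareTool> = {
  add: (args) => String(args.reduce((sum, n) => sum + n, 0)),
};

const bareResult = runBareLoop(scriptedModel, bareTools, "What is 2 + 3?");
```

The sketch has an iteration cap and an unknown-tool fallback and nothing else: no retry, no context trimming, no malformed-output recovery.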
It breaks when reality arrives. On our 35-task benchmark, a bare ReAct loop (an LLM in a while loop with tools) tops out at 85%. The same models inside this harness hit 100%. That gap is everything that isn’t the loop.
This is harness engineering, and there are three honest paths:
Build it yourself
Doable. It’s also months of work and a permanent maintenance tax — every new provider, every new model quirk, every edge case you missed in v1 is yours to chase. Most teams underestimate this by 3×.
Use a black-box harness
Fast to start. Opaque to debug, audit, or override. When the magic breaks at 2am, you’re reading framework internals — without source-level control over the parts that actually matter to your agent.
Use a transparent harness ← Reactive Agents
Every phase emits typed events. The 12-phase pipeline exposes before/after/on-error hooks; system prompts are readable templates, not buried strings; raw provider clients ship standalone so you can skip the harness entirely. Components like the healing pipeline, context curator, and arbitrator are exported and inspectable today — custom-replacement surfaces land progressively (see stability tiers). No hidden prompts, no proprietary loop.
If you’re going to spend the time anyway, spend it on your agent’s logic — not on rebuilding tool-call recovery, context curation, and termination oracles for the third time this year.
Built-in capabilities measured on real workloads — no extra wiring required.
Tool calls that recover themselves
+80pp accuracy
fires on every tool call

Long runs stay cheap
38.6% tokens saved
runs every iteration

Won’t leak your secrets
100% catch rate
scans every output

Always finishes, never stuck
12/12 phases
single-owner termination

Frontier Benchmark ra-full · 4 frontier models · Apr 30 2026 (W21)
Local Model Benchmark ra-full · same harness · 35-task suite (Apr 7 2026 baseline)
Local models tied with frontier on the same 35-task harness. The Healing
Pipeline closes the gap that bare prompting can’t: tool-call recovery for
4B+ Ollama lifts accuracy by +80pp on FC-heavy tasks (6.7% → 86.7%).
Same agent code, same builder chain, just .withProvider("ollama").
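As a rough illustration of what tool-call recovery can involve, the sketch below repairs a malformed call of the kind a small local model might emit. The three stages shown (extract, lenient parse, name resolution) are assumptions made for illustration; they are not the framework's actual Healing Pipeline stages, and `healToolCall` is a hypothetical name.

```typescript
// Hypothetical staged repair of a malformed tool call.
// Stages are illustrative guesses, not the framework's Healing Pipeline.

interface HealedToolCall { name: string; args: Record<string, unknown> }

const KNOWN_TOOLS = ["web-search", "file-read", "code-execute"];

// Stage 1: pull the outermost {...} span out of surrounding chatter.
function extractJsonSpan(raw: string): string | null {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  return start >= 0 && end > start ? raw.slice(start, end + 1) : null;
}

// Stage 2: patch common syntax slips, then parse.
// Naive on purpose: swapping every single quote breaks values containing apostrophes.
function lenientParse(span: string): unknown {
  const patched = span.replace(/'/g, '"').replace(/,\s*([}\]])/g, "$1");
  try {
    return JSON.parse(patched);
  } catch {
    return null;
  }
}

// Stage 3: map a near-miss tool name (case, separators) onto a known tool.
function resolveToolName(name: string): string | null {
  const norm = (s: string) => s.toLowerCase().replace(/[^a-z0-9]/g, "");
  return KNOWN_TOOLS.find((t) => norm(t) === norm(name)) ?? null;
}

function healToolCall(raw: string): HealedToolCall | null {
  const span = extractJsonSpan(raw);
  if (span === null) return null;
  const parsed = lenientParse(span);
  if (typeof parsed !== "object" || parsed === null) return null;
  const obj = parsed as { name?: unknown; args?: unknown };
  if (typeof obj.name !== "string") return null;
  const name = resolveToolName(obj.name);
  if (name === null) return null;
  const args =
    typeof obj.args === "object" && obj.args !== null
      ? (obj.args as Record<string, unknown>)
      : {};
  return { name, args };
}

// Chatty output, single quotes, trailing comma, wrong-case tool name.
const healed = healToolCall(
  "Sure, calling the tool: {'name': 'Web_Search', 'args': {'query': 'bun test',}} Hope that helps.",
);
```

Each stage is deterministic string work, so a repair of this kind costs no extra model call.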
Type-Safe from End to End
Zero any in framework code. Every agent, tool, memory entry, and
LLM call is validated by Effect-TS schemas. Failures are typed tagged
errors, not exceptions. 5,294 tests keep
every service boundary honest on every PR.
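The idea of failures as typed tagged values can be sketched in plain TypeScript, without Effect itself. The error names and the `Outcome` type below are hypothetical, chosen only to show the pattern of a discriminated union checked exhaustively by the compiler.

```typescript
// Tagged errors as plain values (no Effect dependency in this sketch).
// Error names are hypothetical, chosen only to illustrate the pattern.

type AgentError =
  | { _tag: "ToolTimeout"; tool: string; afterMs: number }
  | { _tag: "BudgetExceeded"; spentUsd: number; limitUsd: number }
  | { _tag: "SchemaMismatch"; field: string };

type Outcome<A> = { ok: true; value: A } | { ok: false; error: AgentError };

function describeError(error: AgentError): string {
  switch (error._tag) {
    case "ToolTimeout":
      return `${error.tool} timed out after ${error.afterMs}ms`;
    case "BudgetExceeded":
      return `spent ${error.spentUsd} USD of ${error.limitUsd} USD`;
    case "SchemaMismatch":
      return `invalid field: ${error.field}`;
    default: {
      // Exhaustiveness check: a new tag without a case fails to compile here.
      const unreachable: never = error;
      return unreachable;
    }
  }
}

// Validation returns a value describing the failure instead of throwing.
function parseMaxIterations(input: unknown): Outcome<number> {
  return typeof input === "number" && Number.isInteger(input) && input > 0
    ? { ok: true, value: input }
    : { ok: false, error: { _tag: "SchemaMismatch", field: "maxIterations" } };
}

const parsedIterations = parseMaxIterations("twenty");
const parsedMessage = parsedIterations.ok
  ? `max ${parsedIterations.value}`
  : describeError(parsedIterations.error);
```

Because the failure is part of the return type, a caller cannot reach `value` without first checking `ok`.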
Composable Layer Architecture
30 packages, 13 capability layers. Each is an independent Effect
Layer with explicit dependencies. Memory without guardrails? Reasoning
without cost tracking? Just stream tokens? Pick exactly what you need —
no hidden coupling, no wasted resources.
Observable Execution Engine
12-phase deterministic lifecycle with before / after /
on-error hooks per phase. Every phase emits spans, metrics, and
EventBus events. You see what your agent decided, why, in what order, at
what cost — no manual instrumentation required.
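A per-phase hook model of this shape can be sketched in a few lines. `PhaseHooks` and `runPhases` are hypothetical names and the four phases are a shortened list for illustration; the real engine's hook signatures may differ.

```typescript
// Sketch of a phase runner with before / after / on-error hooks.
// `PhaseHooks` and `runPhases` are hypothetical, not the framework's API.

type DemoPhaseName = "bootstrap" | "think" | "act" | "complete";

interface PhaseHooks {
  before?: (phase: DemoPhaseName) => void;
  after?: (phase: DemoPhaseName, elapsedMs: number) => void;
  onError?: (phase: DemoPhaseName, error: unknown) => void;
}

type DemoPhase = { name: DemoPhaseName; run: () => void };

function runPhases(
  phases: DemoPhase[],
  hooks: PhaseHooks,
  now: () => number = Date.now,
): boolean {
  for (const phase of phases) {
    hooks.before?.(phase.name);
    const startedAt = now();
    try {
      phase.run();
    } catch (error) {
      hooks.onError?.(phase.name, error);
      return false; // stop at the first failing phase
    }
    hooks.after?.(phase.name, now() - startedAt);
  }
  return true;
}

const hookLog: string[] = [];
const phasesOk = runPhases(
  [
    { name: "bootstrap", run: () => {} },
    { name: "think", run: () => {} },
    { name: "act", run: () => { throw new Error("tool crashed"); } },
    { name: "complete", run: () => {} },
  ],
  {
    before: (p) => { hookLog.push(`before:${p}`); },
    after: (p) => { hookLog.push(`after:${p}`); },
    onError: (p, e) => {
      hookLog.push(`error:${p}:${e instanceof Error ? e.message : "unknown"}`);
    },
  },
);
```

In this sketch a failing phase fires `onError` instead of `after`, and later phases never start.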
6 Reasoning Strategies
ReAct, Reflexion, Plan-Execute, Tree-of-Thought, Adaptive, Code-Action (@experimental) — plus automatic strategy switching when entropy detects the agent is stuck. Register your own strategies. ToT outer-loop early-stop and 8-action reactive controller ship out of the box.
Local Models That Actually Work
+80pp accuracy on Ollama 4B+ vs. naive prompting. The 4-stage Healing Pipeline recovers from 86.7% of tool-call errors — 90% cheaper than LLM reprompt. Model-adaptive context tunes prompts and compaction per tier. Same code, frontier-to-local.
MCP-Native Tool I/O
Connect any Model Context Protocol server — local (stdio) or remote (streamable-http). The 9,400+ public MCP servers (filesystem, GitHub, Slack,
browsers, databases) plug in alongside your custom tools via .withMCP(). The protocol
is the integration layer; we don’t reinvent it.
Skills as a Primitive
First-class SKILL.md lifecycle — load, activate, and hand-off built into the kernel, not bolted on. Compatible with the emerging cross-tool skill format used by Claude Code, Codex, and Cursor. Browse the Skills guide →
Frontier-Verified
100% pass on ra-full across claude-sonnet-4-6,
claude-haiku-4-5, gpt-4o-mini, and gemini-2.5-pro (Apr 30 2026).
A bare LLM loop reaches only 85% on the same suite, so the lift is
measurable and comes from the framework, not the model.
Great DX
60 seconds to first agent. Progressive disclosure — start with 3
lines, add reasoning, memory, guardrails, and observability as you need
them. The builder API reads like a sentence. rax CLI scaffolds, runs,
and inspects.
Cortex Local Studio
bunx @reactive-agents/cortex for a full local studio: Beacon (live
agent canvas with entropy charts), Thalamus (visual agent builder), Lab
(debrief UI), and Living Skills views. One flag away from any agent:
.withCortex().
vs. LangChain / LlamaIndex
Python-first, dynamically typed, monolithic. Reactive Agents is
TypeScript-native with zero any, fully modular layers, and
built-in observability. You see every decision — not just the final
output. Side-by-side migration guide included.
vs. Vercel AI SDK
Great for streaming and tool calling, but stops there. Reactive Agents adds 5 reasoning strategies, persistent 4-tier memory, guardrails, verification, cost routing, and a 12-phase execution engine with full observability — same TypeScript ergonomics.
vs. AutoGen / CrewAI
Multi-agent-first frameworks. Reactive Agents takes the Cognition-aligned posture: single-threaded writes, sub-agent delegation only when it pays for itself. Type-safe, composable, with the healing pipeline that lifts local-model accuracy by +80pp — and A2A (JSON-RPC + SSE) ready when you actually need fan-out.
vs. Building From Scratch
40 production-ready packages, 5,294 tests, 12-phase engine. Memory, reasoning, tools, A2A, gateway, reactive intelligence, safety, cost, identity, orchestration — all composable, all opt-in. Focus on your agent’s logic, not infrastructure.
// Token-by-token streaming via AsyncGenerator
for await (const event of agent.runStream("Write a haiku about TypeScript")) {
  if (event._tag === "TextDelta") process.stdout.write(event.text);
  if (event._tag === "IterationProgress") console.log(`Step ${event.iteration}/${event.maxIterations}`);
  if (event._tag === "StreamCompleted") console.log("\nDone!");
}

// One-liner SSE endpoint
Bun.serve({ fetch: (req) => AgentStream.toSSE(agent.runStream("Hello")) });

// Multi-turn conversation with memory
const session = agent.session();

await session.chat("What's the capital of France?");
// → "Paris is the capital of France."

await session.chat("What's the population?");
// → "Paris has approximately 2.1 million residents..."
// (remembers context from previous turn)

// Autonomous agent that runs 24/7
const agent = await ReactiveAgents.create()
  .withProvider("anthropic")
  .withReasoning()
  .withTools()
  .withGateway({
    heartbeat: { intervalMs: 3_600_000, policy: "adaptive" },
    crons: [{ schedule: "0 9 * * MON", instruction: "Weekly report" }],
    webhooks: [{ path: "/github", adapter: "github" }],
    policies: { dailyTokenBudget: 50_000 },
  })
  .build();

agent.start(); // Runs forever, Ctrl+C to stop

Fluent Builder API
Chain capabilities like a sentence — readable and naturally discoverable
6 LLM Providers
Anthropic, OpenAI, Gemini, Ollama, LiteLLM (40+ models) — one unified interface
5 Reasoning Strategies
ReAct, Reflexion, Plan-Execute, Tree-of-Thought, Adaptive
Built-in Tool Suite
web-search, file-read, code-execute, http-get, calculator
4-Tier Memory
Working, Semantic, Episodic, Procedural — all composable layers
Web Framework Hooks
React, Vue & Svelte — useAgentStream, useAgent, createAgentStream out of the box
Effect-TS Type Safety
RuntimeErrors union, typed hooks, zero runtime surprises
const agent = await ReactiveAgents
  .create()
  .withProvider("anthropic")
  .withReasoning()       // ReAct
  .withTools()           // Built-ins
  .withMemory({ tier: "enhanced" })
  .withObservability()
  .build();

const result = await agent.run(task);
// .output  .metadata  .debrief
5 Entropy Sources
Token, structural, semantic, behavioral, context pressure — real-time reasoning quality
Early Stop
Detect convergence and stop early — save tokens and time automatically
Strategy Switching
Auto-switch reasoning strategy when entropy shows the agent is stuck
Trajectory Analysis
Track entropy over time: converging, flat, diverging, oscillating
Per-Model Calibration
Conformal thresholds adapt to each model's characteristics over time
Local Learning
Thompson Sampling bandit learns optimal strategies per task category
.withReactiveIntelligence({
  controller: {
    earlyStop: true,
    contextCompression: true,
    strategySwitch: true,
  },
})

// Dashboard output:
🧠 Reasoning Signal
├─ Grade: B  Signal: converging ↘
├─ Trace: ████▓▒░ 0.65→0.25
└─ Tip: Enable earlyStop
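The four trajectory labels (converging, flat, diverging, oscillating) can be illustrated with a small classifier over a series of entropy readings. This is a sketch under stated assumptions: the `epsilon` threshold and the two-reversal rule are invented for the example, and the framework's actual analysis is not shown here.

```typescript
// Hypothetical classifier for an entropy trace; thresholds are illustrative only.

type Trajectory = "converging" | "flat" | "diverging" | "oscillating";

function classifyTrajectory(entropy: number[], epsilon = 0.05): Trajectory {
  if (entropy.length < 2) return "flat";
  const first = entropy[0] ?? 0;
  const last = entropy[entropy.length - 1] ?? first;

  // Count direction reversals, ignoring moves smaller than epsilon.
  let reversals = 0;
  let previousSign = 0;
  let previous = first;
  for (const value of entropy.slice(1)) {
    const delta = value - previous;
    const sign = Math.abs(delta) <= epsilon ? 0 : Math.sign(delta);
    if (sign !== 0 && previousSign !== 0 && sign !== previousSign) reversals++;
    if (sign !== 0) previousSign = sign;
    previous = value;
  }

  if (reversals >= 2) return "oscillating";
  const net = last - first;
  if (Math.abs(net) <= epsilon) return "flat";
  // Falling entropy means the agent is settling on an answer.
  return net < 0 ? "converging" : "diverging";
}

const settling = classifyTrajectory([0.65, 0.5, 0.4, 0.25]);
const stuck = classifyTrajectory([0.3, 0.6, 0.3, 0.6]);
const steady = classifyTrajectory([0.4, 0.41, 0.4]);
const drifting = classifyTrajectory([0.2, 0.4, 0.7]);
```

A controller could treat "oscillating" or "diverging" as a cue to switch strategy and "converging" as a cue to consider an early stop.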
12-Phase Execution Engine
bootstrap → guardrail → think → act → observe → complete
EventBus Auto-Wiring
Zero manual instrumentation — MetricsCollector subscribes automatically
Live Log Streaming
Real-time phase events at 4 verbosity levels: minimal → debug
Distributed Tracing
OpenTelemetry spans with correlation IDs across every phase
Smart Alerts
Bottleneck detection, budget warnings, optimization suggestions
Cost Metrics
Token count and USD estimate tracked and reported per run
┌──────────────────────────────┐
│ ✅ Execution Summary         │
├──────────────────────────────┤
│ Duration: 13.9s   Steps: 7   │
│ Tokens: 1,963  Cost: ~$0.003 │
└──────────────────────────────┘

📊 Execution Timeline
├─ [bootstrap]  100ms ✅
├─ [guardrail]  50ms ✅
├─ [think]      10,001ms ⚠️ 7 iter
├─ [act]        1,000ms ✅ 2 tools
└─ [complete]   28ms ✅
Prompt Injection Detection
Blocks injection attacks with configurable threshold scoring
PII & Toxicity Scrubbing
Auto-detects sensitive data and toxic content before LLM ingestion
Kill Switch
Pause, resume, or terminate any running agent with zero state corruption
Behavioral Contracts
Tool deny lists, iteration caps, and output pattern enforcement
Budget Enforcement
Per-request, daily, monthly cost caps — auto-halts before overspend
Approval Gates
Human-in-the-loop confirmation for high-risk tool execution
.withGuardrails({
  injectionThreshold: 0.8,
  piiThreshold: 0.9,
  toxicityThreshold: 0.7,
})
.withKillSwitch()
.withBehavioralContracts({
  toolDenyList: ["shell-execute"],
  maxIterations: 20,
})
.withCostTracking({
  budget: { perRequest: 0.10 },
})
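To show what "auto-halts before overspend" can mean in practice, here is a minimal sketch of a pre-call budget check. `checkBudget` and its types are hypothetical, and amounts are held in integer cents as an assumption of the example; the framework's own cost tracker may work differently.

```typescript
// Hypothetical budget guard: block a call whose estimated cost would cross a cap.
// Amounts are integer cents to avoid floating-point drift; names are illustrative.

interface BudgetCaps { perRequestCents: number; dailyCents: number }
interface SpendSoFar { requestCents: number; todayCents: number }

type BudgetDecision =
  | { allowed: true }
  | { allowed: false; exceeded: keyof BudgetCaps };

function checkBudget(
  caps: BudgetCaps,
  spend: SpendSoFar,
  estimatedCents: number,
): BudgetDecision {
  // Check before the call runs, so the cap is never crossed.
  if (spend.requestCents + estimatedCents > caps.perRequestCents) {
    return { allowed: false, exceeded: "perRequestCents" };
  }
  if (spend.todayCents + estimatedCents > caps.dailyCents) {
    return { allowed: false, exceeded: "dailyCents" };
  }
  return { allowed: true };
}

const demoCaps: BudgetCaps = { perRequestCents: 10, dailyCents: 500 };

// 4 + 3 = 7 cents, under both caps.
const firstCall = checkBudget(demoCaps, { requestCents: 4, todayCents: 120 }, 3);
// 8 + 3 = 11 cents, over the 10-cent per-request cap.
const secondCall = checkBudget(demoCaps, { requestCents: 8, todayCents: 124 }, 3);
// 499 + 3 = 502 cents, over the 500-cent daily cap.
const lateCall = checkBudget(demoCaps, { requestCents: 0, todayCents: 499 }, 3);
```

The check runs on the estimate rather than the actual bill, so the quality of the halt depends on how good the cost estimate is.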
Token Streaming
AsyncGenerator with TextDelta, IterationProgress, and SSE adapter
Persistent Gateway
24/7 agent harness with crons, webhooks, adaptive heartbeats
A2A Protocol
Agent-to-agent JSON-RPC 2.0 with SSE streaming and Agent Cards
Hallucination Detection
Semantic entropy + fact decomposition verification layer
Chat Sessions
Multi-turn conversation with adaptive routing and persistent memory
Error Recovery
Retry policies, global error handler, clean FiberFailure unwrapping
for await (const e of agent.runStream(task, {
  signal: ctrl.signal,
})) {
  if (e._tag === "TextDelta") write(e.text);
  if (e._tag === "IterationProgress") log(e.iteration, e.maxIterations);
}
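For a sense of what an SSE adapter does with these events, the sketch below encodes each one as a Server-Sent Events frame. The event shapes mirror the `_tag` values used in the snippets here; `toSseFrame` is a hypothetical helper and not the implementation of `AgentStream.toSSE()`.

```typescript
// Sketch: encoding stream events as Server-Sent Events frames.
// `toSseFrame` is a hypothetical helper; event shapes follow the tags shown above.

type DemoStreamEvent =
  | { _tag: "TextDelta"; text: string }
  | { _tag: "IterationProgress"; iteration: number; maxIterations: number }
  | { _tag: "StreamCompleted" };

function toSseFrame(event: DemoStreamEvent): string {
  // An SSE frame is a set of "field: value" lines ended by a blank line.
  // JSON.stringify escapes newlines, so the data field stays on one line.
  return `event: ${event._tag}\ndata: ${JSON.stringify(event)}\n\n`;
}

const demoEvents: DemoStreamEvent[] = [
  { _tag: "IterationProgress", iteration: 1, maxIterations: 3 },
  { _tag: "TextDelta", text: "Types " },
  { _tag: "TextDelta", text: "hold." },
  { _tag: "StreamCompleted" },
];

let sseBody = "";
let streamedText = "";
for (const event of demoEvents) {
  sseBody += toSseFrame(event);
  if (event._tag === "TextDelta") streamedText += event.text;
}
```

On the client, an `EventSource` listener per event name can parse `data` back with `JSON.parse` and branch on `_tag` the same way the server loop does.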
React Hooks
useAgentStream + useAgent — token streaming and one-shot calls from any React component
Vue Composables
useAgentStream + useAgent with reactive refs — drop into any Vue 3 component
Svelte Stores
createAgentStream writable store — reactive $agent.text, $agent.status out of the box
One-Line SSE Endpoint
AgentStream.toSSE() returns a standard Response — works with Next.js App Router, SvelteKit, Nuxt, Bun
60s to First Agent
One install, three lines, full observability dashboard — then layer in capabilities as you need them
rax CLI + 3,472 Tests
Scaffold, run, inspect — 25 modular packages, battle-tested across 409 test files
# scaffold a new project
$ rax init my-agent \
    --template standard

# run with cloud provider
$ rax run "Analyze codebase" \
    --provider anthropic

# run local, zero API cost
$ rax run "Summarize logs" \
    --provider ollama \
    --model qwen3:14b
Beacon Agent Grid
Live grid of all connected agents with real-time cognitive state and entropy status
Entropy Signal Charts
D3-powered entropy trajectory: watch reasoning quality converge, plateau, or diverge in real time
Step-by-Step Trace Panel
Full Thought → Action → Observation breakdown per iteration, live-streamed or replayed from SQLite
Debrief Summaries
Structured post-run cards: task, plan, outcome, sources, confidence score, and agent self-critique
Interactive Chat
Multi-turn conversational sessions tied to agent runs — same context, persistent history
Lab: Visual Builder
Configure and launch agents without code — skills browser, tool workshop, gateway agent manager
# Terminal 1: start studio (from repo)
$ bun cortex
UI → http://localhost:5173

# Terminal 2: connect agent
$ rax run "Analyze codebase" \
    --provider anthropic \
    --cortex

// or in code:
.withCortex() // one line
// URL: CORTEX_URL env → localhost:4321
Pick the path that matches where you are.
Shape any agent signal — system prompts, tool results, nudges, lifecycle events — with the declarative .compose() API. One line enables full OpenTelemetry export. Six prebuilt killswitches ship in the box.
Re-run any recorded trace deterministically with prompt or model overrides — tool results held constant. Auditable-by-demo: no other framework lets you replay a decision.
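The replay idea can be sketched as follows: tool outputs come from the recording, so only the prompt or model override can change the result. All names here (`RecordedTrace`, `replayTrace`, `echoModel`) are hypothetical, and the model is a deterministic stub standing in for an LLM call.

```typescript
// Hypothetical replay: tool results come from the recording, never from live tools.

interface RecordedToolCall { tool: string; input: string; output: string }
interface RecordedTrace { prompt: string; model: string; toolCalls: RecordedToolCall[] }
interface ReplayOverrides { prompt?: string; model?: string }

type ReplayModel = (model: string, prompt: string, observations: string[]) => string;

function replayTrace(
  trace: RecordedTrace,
  runModel: ReplayModel,
  overrides: ReplayOverrides = {},
): string {
  const prompt = overrides.prompt ?? trace.prompt;
  const model = overrides.model ?? trace.model;
  // Observations are replayed verbatim, so they stay constant across replays.
  const observations = trace.toolCalls.map((call) => call.output);
  return runModel(model, prompt, observations);
}

const recordedTrace: RecordedTrace = {
  prompt: "Summarize",
  model: "model-a",
  toolCalls: [{ tool: "file-read", input: "notes.txt", output: "3 open bugs" }],
};

// Deterministic stub in place of an LLM, so the replay itself is reproducible.
const echoModel: ReplayModel = (model, prompt, observations) =>
  `[${model}] ${prompt} <- ${observations.join("; ")}`;

const originalRun = replayTrace(recordedTrace, echoModel);
const overriddenRun = replayTrace(recordedTrace, echoModel, { prompt: "List risks" });
```

With a real model in place of the stub, determinism would also depend on fixed sampling settings, which this sketch does not cover.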
npm create reactive-agent my-agent
# or
bun create reactive-agent my-agent

Interactive prompts guide you through template (minimal, with-tools, streaming), provider (anthropic, openai, google, ollama), and package manager. Pass --yes for zero-prompt CI scaffolding.
12-phase lifecycle · phases marked ↻ run inside the loop body
bootstrap: Load context, semantic + episodic memory
guardrail: Block injection, PII, toxicity pre-LLM
cost-route: Pick cheapest capable model tier
strategy-select: ReAct · Reflexion · Plan-Execute · ToT
think: LLM reasoning step (one of N iterations)
act: Tool execution + healing pipeline
observe: Append tool results, curate context
verify: Entropy, fact decomposition, NLI check
memory-flush: Persist session, episodic, procedural
cost-track: Record spend, enforce budget
audit: Emit audit events for compliance
complete: Build AgentResult with full metadata
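The phase order can be written down as data and expanded into the sequence a run would visit. One loud assumption: the sketch treats think, act, and observe as the loop body, which the phase descriptions suggest but do not confirm. `expandLifecycle` is a hypothetical helper.

```typescript
// The 12 phases in listed order. Which phases loop is an ASSUMPTION of this sketch.

const LIFECYCLE = [
  "bootstrap", "guardrail", "cost-route", "strategy-select",
  "think", "act", "observe",
  "verify", "memory-flush", "cost-track", "audit", "complete",
] as const;

type LifecyclePhase = (typeof LIFECYCLE)[number];

// Assumed loop body (not confirmed by the list itself).
const ASSUMED_LOOP: LifecyclePhase[] = ["think", "act", "observe"];

// Expand the lifecycle into the phase sequence for a run with N loop iterations.
function expandLifecycle(iterations: number): LifecyclePhase[] {
  const trace: LifecyclePhase[] = [];
  let loopEmitted = false;
  for (const phase of LIFECYCLE) {
    if (ASSUMED_LOOP.indexOf(phase) < 0) {
      trace.push(phase); // runs once per agent run
      continue;
    }
    if (loopEmitted) continue; // loop body already expanded
    for (let i = 0; i < iterations; i++) {
      for (const looped of ASSUMED_LOOP) trace.push(looped);
    }
    loopEmitted = true;
  }
  return trace;
}

const twoIterationRun = expandLifecycle(2);
```

Under that assumption, a two-iteration run visits 15 phase steps: the nine one-shot phases plus two passes through the three looped ones.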