
Reactive Agents

A transparent agent harness for TypeScript — every prompt, tool call, and reasoning step is a typed event you can subscribe to. 12-phase pipeline with before/after/on-error hooks at every stage; raw provider clients ship standalone. MCP-native, type-safe with Effect-TS. The healing pipeline recovers 86.7% of tool-call errors so local Ollama (4B+) runs the same code as Claude, GPT, and Gemini.
40 Packages & Apps
5,294 Tests
6 LLM Providers
7 Reasoning Strategies
12 Execution Phases
bun add reactive-agents

import { ReactiveAgents } from 'reactive-agents'

const agent = await ReactiveAgents.create()
  .withProvider('anthropic')
  .withReasoning()      // ReAct loop: Think → Act → Observe
  .withTools()          // Built-in: web-search, file-read, code-execute
  .withObservability()
  .build()

const result = await agent.run('Find the top 3 TypeScript testing frameworks')
console.log(result.output)

One package. Composable layers. Enable exactly what you need — skip everything you don’t.



Why use a framework? Why not just write the loop?


You can. An agent is a loop — LLM → tool → observe → repeat. Hand-rolling works fine for prototypes, and we’d never tell you otherwise.

It breaks when reality arrives. On our 35-task benchmark, a bare ReAct loop (an LLM in a while loop with tools) tops out at 85%. The same models inside this harness hit 100%. That gap is everything that isn’t the loop:

  • Tool calls returning malformed JSON, wrong types, or hallucinated tool names
  • Loops that don’t terminate — or terminate too early
  • Context that overflows mid-run; memory that leaks between runs
  • Local models dropping reasoning tags, repeating themselves, or refusing structured output
  • Provider-specific streaming quirks; path resolution; type coercion
  • No clean overrides, hooks, or escape hatches when your edge case shows up
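
For reference, the bare loop being compared against can be sketched in a few lines. This is an illustration rather than Reactive Agents code: `ModelStep`, `bareLoop`, and the scripted stand-in model are hypothetical names, and the only safeguard is an iteration cap. Every item in the list above is something this sketch does not handle.

```typescript
// Hypothetical types for illustration; not part of the reactive-agents API.
type ModelStep =
  | { kind: "tool"; name: string; args: Record<string, unknown> }
  | { kind: "final"; output: string };

type Model = (transcript: string[]) => Promise<ModelStep>;
type Tool = (args: Record<string, unknown>) => Promise<string>;

async function bareLoop(
  model: Model,
  tools: Record<string, Tool>,
  task: string,
  maxIterations = 10,
): Promise<{ output: string; iterations: number }> {
  const transcript = [`task: ${task}`];
  for (let i = 1; i <= maxIterations; i++) {
    const step = await model(transcript); // think
    if (step.kind === "final") return { output: step.output, iterations: i };
    const tool = tools[step.name]; // act
    // No recovery: a hallucinated tool name just becomes an error observation.
    const observation = tool
      ? await tool(step.args)
      : `error: unknown tool "${step.name}"`;
    transcript.push(`observation: ${observation}`); // observe
  }
  throw new Error(`no final answer after ${maxIterations} iterations`);
}

// Scripted stand-in for an LLM: calls one tool, then echoes the last observation.
const scriptedModel: Model = async (transcript) =>
  transcript.length === 1
    ? { kind: "tool", name: "add", args: { a: 2, b: 3 } }
    : { kind: "final", output: transcript[transcript.length - 1] ?? "" };

const demoTools: Record<string, Tool> = {
  add: async (args) => String(Number(args.a) + Number(args.b)),
};
```

Calling `bareLoop(scriptedModel, demoTools, "add 2 and 3")` resolves after two iterations with the tool's observation as output.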

This is harness engineering, and there are three honest paths:

Build it yourself

Doable. It’s also months of work and a permanent maintenance tax — every new provider, every new model quirk, every edge case you missed in v1 is yours to chase. Most teams underestimate this by 3×.

Use a black-box harness

Fast to start. Opaque to debug, audit, or override. When the magic breaks at 2am, you’re reading framework internals — without source-level control over the parts that actually matter to your agent.

Use a transparent harness ← Reactive Agents

Every phase emits typed events. The 12-phase pipeline exposes before/after/on-error hooks; system prompts are readable templates, not buried strings; raw provider clients ship standalone so you can skip the harness entirely. Components like the healing pipeline, context curator, and arbitrator are exported and inspectable today — custom-replacement surfaces land progressively (see stability tiers). No hidden prompts, no proprietary loop.

If you’re going to spend the time anyway, spend it on your agent’s logic — not on rebuilding tool-call recovery, context curation, and termination oracles for the third time this year.

Built-in capabilities measured on real workloads — no extra wiring required.

Tool calls that recover themselves

+80pp accuracy

fires on every tool call
  • Recovers from: 86.7% of tool-call errors
  • Cheaper than LLM reprompt: 10×
  • Lifts local models most: 4B+ Ollama

Long runs stay cheap

38.6% tokens saved

runs every iteration
  • Context compression: 60.7%
  • Per-step overhead: 0.16ms
  • Aggressive mode: 44.1%

Won’t leak your secrets

100% catch rate

scans every output
  • False positives: 0%
  • Detection latency: 0.02ms
  • Leak categories: 4 types

Always finishes, never stuck

12/12 phases

single-owner termination
  • Stop paths covered: 100%
  • Loop detector + iter cap: built-in
  • Enforced by: CI
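
To make "loop detector + iter cap" concrete, here is a minimal sketch of single-owner termination: one function decides why a run stops. The names (`createLoopDetector`, `runWithGuards`, `StopReason`) and the repeat threshold are assumptions for illustration, not the framework's implementation.

```typescript
// Hypothetical sketch: one function owns every stop decision.
interface ToolCall {
  name: string;
  args: unknown;
}
type StopReason = "final-answer" | "loop-detected" | "iteration-cap";

// Returns a recorder that reports true once an identical call repeats maxRepeats times.
function createLoopDetector(maxRepeats = 3) {
  const seen = new Map<string, number>();
  return (call: ToolCall): boolean => {
    const key = `${call.name}:${JSON.stringify(call.args)}`;
    const count = (seen.get(key) ?? 0) + 1;
    seen.set(key, count);
    return count >= maxRepeats;
  };
}

// nextStep stands in for the model: it returns a tool call or "done".
function runWithGuards(
  nextStep: (iteration: number) => ToolCall | "done",
  maxIterations = 10,
): StopReason {
  const isLooping = createLoopDetector();
  for (let i = 0; i < maxIterations; i++) {
    const step = nextStep(i);
    if (step === "done") return "final-answer";
    if (isLooping(step)) return "loop-detected";
  }
  return "iteration-cap";
}
```

A model that keeps issuing the same call stops with "loop-detected" on the third repeat; one that keeps issuing new calls stops at the cap.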

Frontier Benchmark ra-full · 4 frontier models · Apr 30 2026 (W21)

claude-sonnet-4-6: 100%
claude-haiku-4-5: 100%
gpt-4o-mini: 100%
gemini-2.5-pro: 100%
bare LLM (no harness): 85%

Local Model Benchmark ra-full · same harness · 35-task suite (Apr 7 2026 baseline)

gemma4:e4b (local · 4 GB): 94%
gemini-2.5-flash (frontier ref): 94%
cogito:14b (local · 9 GB): 91%
gpt-4o-mini (frontier ref): 91%

Local models tied with frontier models on the same 35-task suite. The Healing Pipeline closes the gap that bare prompting can’t: tool-call recovery for 4B+ Ollama models lifts accuracy by +80pp on function-calling (FC) heavy tasks (6.7% → 86.7%). Same agent code, same builder chain; just .withProvider("ollama").

Type-Safe from End to End

Zero any in framework code. Every agent, tool, memory entry, and LLM call is validated by Effect-TS schemas. Failures are typed tagged errors, not exceptions. 5,294 tests keep every service boundary honest on every PR.
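
As a rough picture of "typed tagged errors, not exceptions", the sketch below uses a plain TypeScript discriminated union keyed on `_tag`. Effect-TS ships its own tagged-error machinery; the error names, the `Result` type, and `callCalculator` here are hypothetical and exist only to show the pattern.

```typescript
// Hypothetical error shapes; names are illustrative, not the framework's error union.
type ToolNotFound = { readonly _tag: "ToolNotFound"; readonly tool: string };
type InvalidArgs = {
  readonly _tag: "InvalidArgs";
  readonly tool: string;
  readonly reason: string;
};
type ToolError = ToolNotFound | InvalidArgs;

type Result<A, E> =
  | { readonly ok: true; readonly value: A }
  | { readonly ok: false; readonly error: E };

// Failures are returned as values, so callers must handle them to reach the number.
function callCalculator(tool: string, args: unknown): Result<number, ToolError> {
  if (tool !== "calculator") {
    return { ok: false, error: { _tag: "ToolNotFound", tool } };
  }
  const { a, b } = (args ?? {}) as { a?: unknown; b?: unknown };
  if (typeof a !== "number" || typeof b !== "number") {
    return {
      ok: false,
      error: { _tag: "InvalidArgs", tool, reason: "a and b must be numbers" },
    };
  }
  return { ok: true, value: a + b };
}

// The compiler checks this switch is exhaustive over ToolError.
function explain(error: ToolError): string {
  switch (error._tag) {
    case "ToolNotFound":
      return `no tool named ${error.tool}`;
    case "InvalidArgs":
      return `${error.tool}: ${error.reason}`;
  }
}
```

Adding a third error variant to `ToolError` makes `explain` fail to compile until the new case is handled, which is the practical benefit over thrown exceptions.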

Composable Layer Architecture

30 packages, 13 capability layers. Each is an independent Effect Layer with explicit dependencies. Memory without guardrails? Reasoning without cost tracking? Just stream tokens? Pick exactly what you need — no hidden coupling, no wasted resources.

Observable Execution Engine

12-phase deterministic lifecycle with before / after / on-error hooks per phase. Every phase emits spans, metrics, and EventBus events. You see what your agent decided, why, in what order, at what cost — no manual instrumentation required.
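
The before / after / on-error shape can be sketched as a small phase runner. This assumes synchronous phases and a hypothetical `PhaseHooks` interface; the real hook signatures and event payloads may differ.

```typescript
// Hypothetical hook and phase shapes, for illustration only.
interface PhaseHooks<C> {
  before?: (ctx: C) => void;
  after?: (ctx: C) => void;
  onError?: (ctx: C, error: unknown) => void;
}
interface Phase<C> {
  name: string;
  run: (ctx: C) => C;
}

function runPipeline<C>(
  phases: Phase<C>[],
  hooks: Partial<Record<string, PhaseHooks<C>>>,
  initial: C,
): C {
  let ctx = initial;
  for (const phase of phases) {
    const h = hooks[phase.name];
    h?.before?.(ctx);
    try {
      ctx = phase.run(ctx);
    } catch (error) {
      h?.onError?.(ctx, error);
      throw error; // hooks observe the failure; they do not swallow it
    }
    h?.after?.(ctx);
  }
  return ctx;
}
```

Phases run in array order, so the same input and the same phases always produce the same hook sequence.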

6 Reasoning Strategies

ReAct, Reflexion, Plan-Execute, Tree-of-Thought, Adaptive, Code-Action (@experimental) — plus automatic strategy switching when entropy detects the agent is stuck. Register your own strategies. ToT outer-loop early-stop and 8-action reactive controller ship out of the box.

Local Models That Actually Work

+80pp accuracy on Ollama 4B+ vs. naive prompting. The 4-stage Healing Pipeline recovers from 86.7% of tool-call errors — 90% cheaper than LLM reprompt. Model-adaptive context tunes prompts and compaction per tier. Same code, frontier-to-local.
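
To give a feel for what tool-call healing involves, here is a deliberately small sketch with three stages: extract JSON from chatter, resolve a near-miss tool name, and coerce argument types. The stage breakdown, function names, and schema format are assumptions; the framework's actual 4-stage pipeline is not reproduced here.

```typescript
// Illustrative only: stages and names are assumptions, not the framework's pipeline.
type ArgType = "string" | "number" | "boolean";
type ToolSchema = Record<string, ArgType>;
interface HealedCall {
  tool: string;
  args: Record<string, string | number | boolean>;
}

// Stage 1: pull the JSON object out of surrounding text and drop trailing commas.
// Naive on purpose: the comma regex does not account for commas inside strings.
function extractJson(raw: string): unknown {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start < 0 || end <= start) throw new Error("no JSON object found");
  return JSON.parse(raw.slice(start, end + 1).replace(/,\s*([}\]])/g, "$1"));
}

// Stage 2: match names that differ only by case or separators (Web_Search vs web-search).
function resolveTool(name: string, known: string[]): string {
  const norm = (s: string) => s.toLowerCase().replace(/[^a-z0-9]/g, "");
  const match = known.find((k) => norm(k) === norm(name));
  if (!match) throw new Error(`unknown tool: ${name}`);
  return match;
}

// Stage 3: coerce argument values to the types the schema declares.
function coerceArgs(args: Record<string, unknown>, schema: ToolSchema): HealedCall["args"] {
  const out: HealedCall["args"] = {};
  for (const [key, type] of Object.entries(schema)) {
    const value = args[key];
    if (value == null || value === "") throw new Error(`${key}: missing`);
    if (type === "number") {
      const n = Number(value);
      if (Number.isNaN(n)) throw new Error(`${key}: expected number`);
      out[key] = n;
    } else if (type === "boolean") {
      if (value === true || value === "true") out[key] = true;
      else if (value === false || value === "false") out[key] = false;
      else throw new Error(`${key}: expected boolean`);
    } else {
      out[key] = String(value);
    }
  }
  return out;
}

function healToolCall(raw: string, registry: Record<string, ToolSchema>): HealedCall {
  const parsed = extractJson(raw) as { tool?: unknown; args?: unknown };
  const tool = resolveTool(String(parsed.tool), Object.keys(registry));
  const args = (parsed.args ?? {}) as Record<string, unknown>;
  return { tool, args: coerceArgs(args, registry[tool] ?? {}) };
}
```

Each stage is deterministic string or object manipulation, which is why this kind of repair can be much cheaper than asking the model to try again.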

MCP-Native Tool I/O

Connect any Model Context Protocol server — local (stdio) or remote (streamable-http). The 9,400+ public MCP servers (filesystem, GitHub, Slack, browsers, databases) plug in alongside your custom tools via .withMCP(). The protocol is the integration layer; we don’t reinvent it.

Skills as a Primitive

First-class SKILL.md lifecycle — load, activate, and hand-off built into the kernel, not bolted on. Compatible with the emerging cross-tool skill format used by Claude Code, Codex, and Cursor. Browse the Skills guide →

Frontier-Verified

100% pass on ra-full across claude-sonnet-4-6, claude-haiku-4-5, gpt-4o-mini, and gemini-2.5-pro (Apr 30 2026). A bare LLM with no harness reaches only 85% on the same benchmark, so the lift is measurable and comes from the framework, not the model.

Great DX

60 seconds to first agent. Progressive disclosure — start with 3 lines, add reasoning, memory, guardrails, and observability as you need them. The builder API reads like a sentence. rax CLI scaffolds, runs, and inspects.

Cortex Local Studio

bunx @reactive-agents/cortex for a full local studio: Beacon (live agent canvas with entropy charts), Thalamus (visual agent builder), Lab (debrief UI), and Living Skills views. One flag away from any agent: .withCortex().

vs. LangChain / LlamaIndex

Python-first, dynamically typed, monolithic. Reactive Agents is TypeScript-native with zero any, fully modular layers, and built-in observability. You see every decision — not just the final output. Side-by-side migration guide included.

vs. Vercel AI SDK

Great for streaming and tool calling, but stops there. Reactive Agents adds 5 reasoning strategies, persistent 4-tier memory, guardrails, verification, cost routing, and a 12-phase execution engine with full observability — same TypeScript ergonomics.

vs. AutoGen / CrewAI

Multi-agent-first frameworks. Reactive Agents takes the Cognition-aligned posture: single-threaded writes, sub-agent delegation only when it pays for itself. Type-safe, composable, with the healing pipeline that lifts local-model accuracy by +80pp — and A2A (JSON-RPC + SSE) ready when you actually need fan-out.

vs. Building From Scratch

40 production-ready packages, 5,294 tests, 12-phase engine. Memory, reasoning, tools, A2A, gateway, reactive intelligence, safety, cost, identity, orchestration — all composable, all opt-in. Focus on your agent’s logic, not infrastructure.

// Token-by-token streaming via AsyncGenerator
for await (const event of agent.runStream("Write a haiku about TypeScript")) {
  if (event._tag === "TextDelta") process.stdout.write(event.text);
  if (event._tag === "IterationProgress") console.log(`Step ${event.iteration}/${event.maxIterations}`);
  if (event._tag === "StreamCompleted") console.log("\nDone!");
}

// One-liner SSE endpoint
Bun.serve({ fetch: (req) => AgentStream.toSSE(agent.runStream("Hello")) });
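
For readers curious what an SSE adapter does with such a stream, the sketch below turns tagged events into Server-Sent Events frames. It is not the implementation of `AgentStream.toSSE`: the event payload shapes, `fakeRunStream`, and `toSseFrames` are assumptions, and only the frame layout (`event:` and `data:` lines followed by a blank line) comes from the SSE format itself.

```typescript
// Event tags mirror the snippet above; payload shapes are assumptions.
type StreamEvent =
  | { _tag: "TextDelta"; text: string }
  | { _tag: "IterationProgress"; iteration: number; maxIterations: number }
  | { _tag: "StreamCompleted" };

// Stand-in for agent.runStream(): yields a fixed sequence of events.
async function* fakeRunStream(): AsyncGenerator<StreamEvent> {
  yield { _tag: "IterationProgress", iteration: 1, maxIterations: 3 };
  yield { _tag: "TextDelta", text: "Types flow like" };
  yield { _tag: "TextDelta", text: " water" };
  yield { _tag: "StreamCompleted" };
}

// Encodes each event as one SSE frame: "event: <tag>\ndata: <json>\n\n".
async function* toSseFrames(events: AsyncIterable<StreamEvent>): AsyncGenerator<string> {
  for await (const event of events) {
    yield `event: ${event._tag}\ndata: ${JSON.stringify(event)}\n\n`;
  }
}

// Consumer side: concatenate text deltas and ignore everything else.
async function collectText(events: AsyncIterable<StreamEvent>): Promise<string> {
  let text = "";
  for await (const event of events) {
    if (event._tag === "TextDelta") text += event.text;
  }
  return text;
}
```

Because the discriminant is `_tag`, the compiler narrows each branch, so `event.text` is only reachable on `TextDelta`.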

Fluent Builder API

Chain capabilities like a sentence — readable and naturally discoverable

🔌

6 LLM Providers

Anthropic, OpenAI, Gemini, Ollama, LiteLLM (40+ models) — one unified interface

🧠

5 Reasoning Strategies

ReAct, Reflexion, Plan-Execute, Tree-of-Thought, Adaptive

🔧

Built-in Tool Suite

web-search, file-read, code-execute, http-get, calculator

💾

4-Tier Memory

Working, Semantic, Episodic, Procedural — all composable layers

🌐

Web Framework Hooks

React, Vue & Svelte — useAgentStream, useAgent, createAgentStream out of the box

🔒

Effect-TS Type Safety

RuntimeErrors union, typed hooks, zero runtime surprises

builder api
const agent = await ReactiveAgents
  .create()
  .withProvider("anthropic")
  .withReasoning()      // ReAct
  .withTools()           // Built-ins
  .withMemory({
    tier: "enhanced"
  })
  .withObservability()
  .build();

const result = await agent.run(task);
// .output .metadata .debrief
🧠

5 Entropy Sources

Token, structural, semantic, behavioral, context pressure — real-time reasoning quality

Early Stop

Detect convergence and stop early — save tokens and time automatically

🔄

Strategy Switching

Auto-switch reasoning strategy when entropy shows the agent is stuck

📊

Trajectory Analysis

Track entropy over time: converging, flat, diverging, oscillating

🎯

Per-Model Calibration

Conformal thresholds adapt to each model's characteristics over time

📈

Local Learning

Thompson Sampling bandit learns optimal strategies per task category

reactive intelligence
.withReactiveIntelligence({
  controller: {
    earlyStop: true,
    contextCompression: true,
    strategySwitch: true,
  },
})

// Dashboard output:
🧠 Reasoning Signal
├─ Grade: B  Signal: converging 
├─ Trace: ████▓▒░ 0.65 → 0.25
└─ Tip: Enable earlyStop
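
The four trajectory labels can be illustrated with a small classifier over per-iteration entropy values. This is a sketch under stated assumptions: entropy is a number in [0, 1], the `epsilon` threshold and the "two reversals means oscillating" rule are invented for the example, and the framework's actual analysis may be quite different.

```typescript
type Trajectory = "converging" | "flat" | "diverging" | "oscillating";

// Assumes one entropy value per iteration; thresholds are illustrative.
function classifyTrajectory(entropy: number[], epsilon = 0.05): Trajectory {
  if (entropy.length < 2) return "flat";

  // Step-to-step changes, keeping only the direction of moves larger than epsilon.
  const moves: number[] = [];
  for (let i = 1; i < entropy.length; i++) {
    const delta = (entropy[i] ?? 0) - (entropy[i - 1] ?? 0);
    if (Math.abs(delta) > epsilon) moves.push(Math.sign(delta));
  }

  // Two or more direction reversals count as oscillation.
  let reversals = 0;
  for (let i = 1; i < moves.length; i++) {
    if (moves[i] !== moves[i - 1]) reversals++;
  }
  if (reversals >= 2) return "oscillating";

  // Otherwise the net change decides: falling entropy means converging.
  const net = (entropy[entropy.length - 1] ?? 0) - (entropy[0] ?? 0);
  if (net < -epsilon) return "converging";
  if (net > epsilon) return "diverging";
  return "flat";
}
```

Under these assumptions, a trace that falls from 0.65 to 0.25 classifies as converging, which is the kind of signal an early-stop controller could act on.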
📊

12-Phase Execution Engine

bootstrap → guardrail → think → act → observe → complete

🔔

EventBus Auto-Wiring

Zero manual instrumentation — MetricsCollector subscribes automatically

Live Log Streaming

Real-time phase events at 4 verbosity levels: minimal → debug

🔍

Distributed Tracing

OpenTelemetry spans with correlation IDs across every phase

💡

Smart Alerts

Bottleneck detection, budget warnings, optimization suggestions

📈

Cost Metrics

Token count and USD estimate tracked and reported per run

dashboard output
┌──────────────────────────────┐
 ✅ Execution Summary         
├──────────────────────────────┤
 Duration: 13.9s  Steps: 7    
 Tokens:  1,963  Cost: ~$0.003
└──────────────────────────────┘

📊 Execution Timeline
├─ [bootstrap]   100ms 
├─ [guardrail]    50ms 
├─ [think]    10,001ms ⚠️ 7 iter
├─ [act]       1,000ms  2 tools
└─ [complete]     28ms 
🛡️

Prompt Injection Detection

Blocks injection attacks with configurable threshold scoring

🔏

PII & Toxicity Scrubbing

Auto-detects sensitive data and toxic content before LLM ingestion

Kill Switch

Pause, resume, or terminate any running agent with zero state corruption

📋

Behavioral Contracts

Tool deny lists, iteration caps, and output pattern enforcement

💰

Budget Enforcement

Per-request, daily, monthly cost caps — auto-halts before overspend

Approval Gates

Human-in-the-loop confirmation for high-risk tool execution

safety config
.withGuardrails({
  injectionThreshold: 0.8,
  piiThreshold:       0.9,
  toxicityThreshold:  0.7,
})
.withKillSwitch()
.withBehavioralContracts({
  toolDenyList: ["shell-execute"],
  maxIterations: 20,
})
.withCostTracking({
  budget: { perRequest: 0.10 },
})
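
"Auto-halts before overspend" implies checking the projected cost before a call is made, not after. A minimal sketch of that check follows; the `Budget` and `Spend` shapes, the function names, and the per-million-token prices are placeholders, not the framework's cost model.

```typescript
// Hypothetical budget guard; cap names echo the config above, the logic is an assumption.
interface Budget {
  perRequest: number; // USD
  daily: number; // USD
}
interface Spend {
  request: number; // USD spent so far in this request
  today: number; // USD spent so far today
}
type BudgetDecision = { allowed: true } | { allowed: false; reason: string };

// Rough estimate from token counts and per-million-token prices (placeholder values).
function estimateCost(
  inputTokens: number,
  outputTokens: number,
  usdPerMTokIn: number,
  usdPerMTokOut: number,
): number {
  return (inputTokens * usdPerMTokIn + outputTokens * usdPerMTokOut) / 1_000_000;
}

// Rejects a call when the projected total would cross either cap.
function checkBudget(budget: Budget, spend: Spend, estimatedCost: number): BudgetDecision {
  if (spend.request + estimatedCost > budget.perRequest) {
    return { allowed: false, reason: "perRequest cap would be exceeded" };
  }
  if (spend.today + estimatedCost > budget.daily) {
    return { allowed: false, reason: "daily cap would be exceeded" };
  }
  return { allowed: true };
}
```

Running the check before each LLM call means a run stops at the last affordable step instead of discovering the overspend afterwards.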
🌊

Token Streaming

AsyncGenerator with TextDelta, IterationProgress, and SSE adapter

🤖

Persistent Gateway

24/7 agent harness with crons, webhooks, adaptive heartbeats

🔗

A2A Protocol

Agent-to-agent JSON-RPC 2.0 with SSE streaming and Agent Cards

🧪

Hallucination Detection

Semantic entropy + fact decomposition verification layer

💬

Chat Sessions

Multi-turn conversation with adaptive routing and persistent memory

🔁

Error Recovery

Retry policies, global error handler, clean FiberFailure unwrapping

streaming
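// AbortController supplies the signal passed to runStream below
const ctrl = new AbortController();

for await (const e of agent.runStream(task, { signal: ctrl.signal })) {
  if (e._tag === "TextDelta") write(e.text);
  if (e._tag === "IterationProgress") log(e.iteration, e.maxIterations);
}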
⚛️

React Hooks

useAgentStream + useAgent — token streaming and one-shot calls from any React component

💚

Vue Composables

useAgentStream + useAgent with reactive refs — drop into any Vue 3 component

🧡

Svelte Stores

createAgentStream writable store — reactive $agent.text, $agent.status out of the box

🌊

One-Line SSE Endpoint

AgentStream.toSSE() returns a standard Response — works with Next.js App Router, SvelteKit, Nuxt, Bun

60s to First Agent

One install, three lines, full observability dashboard — then layer in capabilities as you need them

🛠️

rax CLI + 3,472 Tests

Scaffold, run, inspect — 25 modular packages, battle-tested across 409 test files

rax cli
# scaffold a new project
$ rax init my-agent \
    --template standard

# run with cloud provider
$ rax run "Analyze codebase" \
    --provider anthropic

# run local — zero API cost
$ rax run "Summarize logs" \
    --provider ollama \
    --model qwen3:14b
🔭

Beacon Agent Grid

Live grid of all connected agents with real-time cognitive state and entropy status

📈

Entropy Signal Charts

D3-powered entropy trajectory: watch reasoning quality converge, plateau, or diverge in real time

🧵

Step-by-Step Trace Panel

Full Thought → Action → Observation breakdown per iteration, live-streamed or replayed from SQLite

📋

Debrief Summaries

Structured post-run cards: task, plan, outcome, sources, confidence score, and agent self-critique

💬

Interactive Chat

Multi-turn conversational sessions tied to agent runs — same context, persistent history

🔬

Lab: Visual Builder

Configure and launch agents without code — skills browser, tool workshop, gateway agent manager

cortex studio
# Terminal 1: start studio (from repo)
$ bun cortex
UI → http://localhost:5173

# Terminal 2: connect agent
$ rax run "Analyze codebase" \
    --provider anthropic \
    --cortex

// or in code:
.withCortex()  // one line
// URL: CORTEX_URL env → localhost:4321

Pick the path that matches where you are.

Shape any agent signal — system prompts, tool results, nudges, lifecycle events — with the declarative .compose() API. One line enables full OpenTelemetry export. Six prebuilt killswitches ship in the box.

Compose API Reference · Tag Catalog · 9 Recipes

Re-run any recorded trace deterministically with prompt or model overrides — tool results held constant. Auditable-by-demo: no other framework lets you replay a decision.

Snapshot & Replay
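
The idea behind replay, holding tool results constant while the prompt or model changes, can be sketched as follows. The `Trace` format, the `Decide` function standing in for a model, and the divergence check are all assumptions for illustration; the framework's recorded traces and replay API are not shown here.

```typescript
// Hypothetical trace format; real recorded traces may look different.
interface RecordedToolCall {
  name: string;
  args: string;
  result: string;
}
interface Trace {
  prompt: string;
  toolCalls: RecordedToolCall[];
}

// Stand-in for a model: given the prompt and observations so far, pick the next step.
type Decide = (
  prompt: string,
  observations: string[],
) => { tool: string; args: string } | { answer: string };

function replay(trace: Trace, decide: Decide, overrides: { prompt?: string } = {}) {
  const prompt = overrides.prompt ?? trace.prompt;
  const observations: string[] = [];
  let cursor = 0;
  for (;;) {
    const step = decide(prompt, observations);
    if ("answer" in step) return { answer: step.answer, toolCallsReplayed: cursor };
    // Tools are never executed: results come from the recording, in order.
    const recorded = trace.toolCalls[cursor];
    if (!recorded || recorded.name !== step.tool || recorded.args !== step.args) {
      throw new Error(`replay diverged at tool call ${cursor}`);
    }
    observations.push(recorded.result);
    cursor++;
  }
}
```

Because no tool runs during replay, any difference in the final answer is attributable to the override. A run that asks for a tool call the recording does not contain fails loudly instead of silently re-executing it.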

npm create reactive-agent my-agent
# or bun create reactive-agent my-agent

Interactive prompts guide you through template (minimal, with-tools, streaming), provider (anthropic, openai, google, ollama), and package manager. Pass --yes for zero-prompt CI scaffolding.

create-reactive-agent

12-phase lifecycle · phases marked run inside the loop body

01 bootstrap: Load context, semantic + episodic memory
02 guardrail: Block injection, PII, toxicity pre-LLM
03 cost-route: Pick cheapest capable model tier
04 strategy-select: ReAct · Reflexion · Plan-Execute · ToT
05 think: LLM reasoning step (one of N iterations)
06 act: Tool execution + healing pipeline
07 observe: Append tool results, curate context
08 verify: Entropy, fact decomposition, NLI check
09 memory-flush: Persist session, episodic, procedural
10 cost-track: Record spend, enforce budget
11 audit: Emit audit events for compliance
12 complete: Build AgentResult with full metadata
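
As one example of what a phase might do, the cost-route step ("pick cheapest capable model tier") reduces to a filter and a minimum. The `ModelTier` shape, the numeric capability score, and the tier names and prices in the usage note are hypothetical; how the framework actually scores capability is not described on this page.

```typescript
// Hypothetical tier description; capability is an invented 1-10 score.
interface ModelTier {
  name: string;
  usdPerMTok: number;
  capability: number;
}

// Keeps tiers that meet the requirement, then returns the cheapest of them.
function pickCheapestCapable(tiers: ModelTier[], requiredCapability: number): ModelTier {
  const capable = tiers.filter((t) => t.capability >= requiredCapability);
  if (capable.length === 0) {
    throw new Error("no tier meets the required capability");
  }
  return capable.reduce((best, t) => (t.usdPerMTok < best.usdPerMTok ? t : best));
}
```

Given placeholder tiers `local` (capability 4, $0), `small` (6, $0.5) and `large` (9, $5), a task needing capability 5 routes to `small` and a task needing 3 routes to `local`.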

See full installation guide →