Agents that never fail.
Policies that never bend.

The durable agent runtime that persists state, executes tools transactionally, and enforces every policy.
Reliability and trust by construction.

Ziverge
Code-first

Agents are code, not prompts.

Typed agents and tools in TypeScript, Rust, Scala, or MoonBit. State persists across failures, tool calls fire exactly once, and your code harnesses the model.

orders-agent.ts
const Orders = agentDefinition('orders')
  .id({ customerId: z.string() })
  .config(z.object({ systemPrompt: z.string() }))
  .method('handle', m => m
    .input(z.object({ request: z.string(), orderId: z.string() }))
    .returns(z.object({ resolved: z.boolean() })))

export default Orders.implement({
  init: () => ({ history: [] as Message[] }),
  methods: {
    async handle({ request, orderId }) {
      // Durable in-memory state — survives crashes, deploys, host migrations
      this.history.push({ role: 'user', content: request })

      // LLM sees full conversation; system prompt comes from typed config
      const outcome = await llm.run({
        prompt: this.config.systemPrompt, history: this.history,
        tools: [cancelOrder, changeAddress], context: { orderId },
      })
      this.history.push({ role: 'assistant', content: outcome.message })

      // Refunds aren't in the LLM's toolset — agent code gates them via HITL
      if (outcome.needsRefund) {
        const { approved } = await webhooks.awaitApproval(outcome)
        if (approved) {
          // Transactional — refund executes exactly once, even through crashes or restarts
          const result = await refundOrder({ orderId, amount: outcome.refundAmount })
          this.history.push({ role: 'tool', content: JSON.stringify(result) })
        }
      }
      return { resolved: true }
    },
  },
})

Persists state.

State changes and effects are captured automatically, without serialization, state machines, or annotations. Agents suspend for days or weeks at zero compute and zero memory cost, resuming with the same memory, locals, and call stack.

Treat memory as durable.

Executes transactionally.

Agent logic, tools, and inter-agent calls run exactly once — not "at-least-once with idempotency disclaimers." Transient failures retry without exiting your agent; any interruption — restart, redeploy, eviction, hardware fault — recovers with full state.

Ship code that runs exactly once.
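
To make the exactly-once claim concrete, here is a minimal sketch of journal-based replay, one common way durable runtimes avoid re-running side effects after a restart. It is a toy illustration, not Golem's implementation: `Journal`, `step`, `chargeCard`, and `handleRefund` are hypothetical names invented for this example, and the sketch ignores the hard case where a crash lands between an effect and its journal write.

```typescript
// Toy illustration only. Journal, step, chargeCard, and handleRefund are
// hypothetical names for this sketch, not part of any Golem SDK.

type Entry = { key: string; result: unknown };

class Journal {
  private entries: Entry[] = [];
  private cursor = 0;

  // Run `effect` the first time; on replay, return the recorded result instead.
  step<T>(key: string, effect: () => T): T {
    if (this.cursor < this.entries.length) {
      const recorded = this.entries[this.cursor++];
      if (recorded.key !== key) {
        throw new Error(`replay diverged: expected ${recorded.key}, got ${key}`);
      }
      return recorded.result as T;
    }
    const result = effect();
    this.entries.push({ key, result });
    this.cursor++;
    return result;
  }

  // Simulate a process restart: the journal survives, the cursor rewinds.
  restart(): void {
    this.cursor = 0;
  }
}

// A side effect we must not repeat; the counter stands in for a payment API.
let charges = 0;
const chargeCard = (amount: number): string => {
  charges++;
  return `receipt-${amount}`;
};

function handleRefund(journal: Journal, crashAfterCharge: boolean): string {
  const receipt = journal.step("charge", () => chargeCard(25));
  if (crashAfterCharge) throw new Error("simulated crash");
  return journal.step("notify", () => `sent ${receipt}`);
}

const journal = new Journal();
try {
  handleRefund(journal, true); // first attempt crashes after charging
} catch {
  journal.restart();
}
// The second attempt replays the recorded charge and only runs the new step.
const outcome = handleRefund(journal, false);
console.log(outcome, charges);
```

In the sketch, the second run reuses the recorded receipt rather than calling `chargeCard` again, so the charge counter stays at one even though the handler ran twice.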

Enforces every policy.

Every agent and tool runs in its own WASM sandbox — millisecond startup, megabytes of memory — with capabilities that can't be forged or leaked*. Rate, capacity, and concurrency limits are runtime-enforced; every authorization is journaled.

Turn policies into guarantees.
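
The idea of bounded, journaled authority can be sketched in plain TypeScript. The example below is an assumption-laden illustration, not Golem's mechanism: `grant`, `invoke`, `auditLog`, and the `Action` names are all hypothetical, and module-level encapsulation is a much weaker boundary than a WASM sandbox. It shows only the shape of the check: a capability is an opaque token, permissions live where the caller cannot reach them, and every decision is recorded.

```typescript
// Hypothetical sketch. None of these names are Golem APIs.

type Action = "order.cancel" | "order.refund";
type Capability = Readonly<{ kind: "capability" }>;

// Permissions are stored here, keyed by token identity, so a look-alike
// object that was never passed through `grant` has no authority.
const grants = new WeakMap<Capability, ReadonlySet<Action>>();

// Every authorization decision is appended here, allowed or denied.
const auditLog: string[] = [];

function grant(...actions: Action[]): Capability {
  const cap: Capability = Object.freeze({ kind: "capability" as const });
  grants.set(cap, new Set(actions));
  return cap;
}

function invoke<T>(cap: Capability, action: Action, tool: () => T): T {
  const allowed = grants.get(cap)?.has(action) ?? false;
  auditLog.push(`${action}:${allowed ? "allow" : "deny"}`);
  if (!allowed) throw new Error(`capability does not permit ${action}`);
  return tool();
}

// A support agent may cancel orders but was never granted refunds.
const supportCap = grant("order.cancel");
const cancelled = invoke(supportCap, "order.cancel", () => "cancelled");

let refundBlocked = false;
try {
  invoke(supportCap, "order.refund", () => "refunded");
} catch {
  refundBlocked = true;
}

// A structurally identical object is not a minted token, so it is denied.
const forged: Capability = { kind: "capability" };
let forgedBlocked = false;
try {
  invoke(forged, "order.cancel", () => "cancelled");
} catch {
  forgedBlocked = true;
}
console.log(cancelled, refundBlocked, forgedBlocked, auditLog);
```

Keeping the permission set outside the token means that holding or copying the token's fields grants nothing; only identity with a minted token counts.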

Why not LangChain?

LangChain leaves you the hard parts.

Tool calls firing twice. State lost mid-node. SQL checkpointers under load. These aren't problems LangChain solves — they're runtime problems. Golem solves them as runtime guarantees.

AI Framework: Agents share host resources; tenant isolation depends on developer-enforced discipline
Golem Runtime: Each agent owns its own filesystem, SQLite database, and environment
Why it matters: Cross-tenant leaks become structurally impossible

AI Framework: Durability is opt-in and coarse; recovery restarts from the last boundary, losing in-flight state
Golem Runtime: Every state change is captured automatically; in-flight state survives any failure or suspension
Why it matters: No state is ever lost to failure or suspension

AI Framework: Auto-retry only works for idempotent tools; the rest require developer-managed safety logic
Golem Runtime: Agent logic and tools, internal or external, execute durably with exactly-once semantics
Why it matters: Infrastructure failures never cause partial or duplicate work

AI Framework: Authority enforced by developer-written code and LLM prompts; both fail when their authors do
Golem Runtime: Capabilities bounded by the runtime; code can only do what it's granted
Why it matters: Buggy or malicious code can't exceed what it was granted

AI Framework: Waits, retries, and HITL require explicit state-machine code at framework boundaries
Golem Runtime: Any flow is just code; suspension, retries, and resumption are runtime behaviors
Why it matters: No state-machine code to write or maintain

Runtimes deliver what frameworks can't even promise.

10,000+: Active agents per node
2 ms: Agent cold start
1 MB: Minimum sandbox memory
0 CPU/RAM: Idle resource cost

Bring your stack

Your libraries. Our runtime.

Bring your favorite LLM SDKs, your tool libraries, your utilities — anything that's just code. They run on Golem, and your agent logic and tool primitives inherit the runtime's guarantees, without modification.

Use Golem's lightweight SDKs only when you want runtime-specific features: durability hooks, forking, rollbacks, agent and tool discovery. Frameworks that bring their own runtime aren't officially supported today.

The full package

Bundled into the runtime.

OpenTelemetry built-in
Every step traced, every metric auto-emitted
MCP server, automatic
Your agents are MCP servers out of the box
Model-agnostic
Any model via HTTP — routing stays in your code
Scheduled execution
Cron-native, with durability across runs
Webhook primitives
Incoming events like HITL become awaitable promises
Tool-calling protocols
MCP, HTTP, RPC — all exactly-once
Sandboxed by construction
Every component runs in a WASM sandbox at instance cost
Streaming, durably
WebSocket and SSE flows resume across deploys
First-class quotas
Rate, capacity, concurrency, GPU — one mechanism
Replay-driven evaluation
The oplog is your eval substrate, no separate harness
A2A protocol interop *
Peer agents across runtime boundaries
Tool middleware *
Policies and guardrails with ironclad guarantees
First-class secrets *
Opaque handles, capability-gated reveal
Per-tool capabilities *
Each tool call carries its own bounded authority
Source-available. Your cloud. Your language.

Run it where you want. Write it how you like.

BUSL-1.1 → Apache-2.0

The runtime source is auditable, the WASM components are inspectable, and the license transitions to Apache-2.0 — staying out of your way today, fully permissive tomorrow.

Your cloud.

Run Golem where you run everything else — on a laptop, in Docker, in Kubernetes, on any cloud, or on-prem.

Your language.

Same runtime, same capabilities, same guarantees, same operational behavior across supported languages.

TypeScript: Strongest surface
Rust: Substrate-credible
Scala: Effects-friendly
MoonBit: Small WASM

Built by the wizards behind ZIO — the open-source effect system running in production at companies across fintech, ad tech, and AI infrastructure for the better part of a decade.

Crash your first agent in five minutes.
Watch it come back.

Scaffold a durable agent, run it locally, kill the process at any line, and watch it resume exactly where it stopped.

# Install: download from github.com/golemcloud/golem/releases
golem new --template ts --component-name example:counter --yes my-agent
cd my-agent && golem build
golem repl