Agents SDK: Enhanced Browser Automation & Code Recovery

The latest release of the Agents SDK makes it easier to build agents that can safely interact with real systems and keep working through interruptions.

Agents can now browse websites through Browser Run, write code against external tools through Codemode, use client-provided tools when delegating to Think sub-agents, and recover more reliably from deploys, Durable Object evictions, and connection churn.

Safer browser automation

Agents can now use Browser Run through a single durable browser_execute tool. Instead of choosing from a fixed list of actions, the model writes code against the Chrome DevTools Protocol (CDP) and can inspect pages, capture screenshots, read rendered content, debug frontend behavior, and interact with live browser sessions.

TypeScript

const browserTools = createBrowserTools({
  ctx: this.ctx,
  browser: this.env.BROWSER,
  loader: this.env.LOADER,
  session: { mode: "dynamic" },
});
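
To give a sense of what model-written code against CDP might look like, here is a minimal sketch. The `CdpSession` interface, its `send` method, and the in-memory fake are hypothetical stand-ins rather than the SDK's actual surface; `Page.navigate`, `Runtime.evaluate`, and `Page.captureScreenshot` are standard CDP methods.

```typescript
// Hypothetical minimal CDP client shape (an assumption for illustration);
// the interface Browser Run actually exposes to model-written code may differ.
interface CdpSession {
  send(method: string, params?: Record<string, unknown>): Promise<any>;
}

// Navigate to a page, read its title, and capture a screenshot.
// A real run would also wait for the page to finish loading
// (for example on the Page.loadEventFired event) before evaluating.
async function inspectPage(cdp: CdpSession, url: string) {
  await cdp.send("Page.navigate", { url });
  const evaluated = await cdp.send("Runtime.evaluate", {
    expression: "document.title",
    returnByValue: true,
  });
  const shot = await cdp.send("Page.captureScreenshot", { format: "png" });
  return {
    title: String(evaluated.result.value),
    screenshotBase64: String(shot.data),
  };
}

// In-memory fake used only to exercise the sketch without a real browser.
function createFakeCdp(calls: string[]): CdpSession {
  return {
    async send(method) {
      calls.push(method);
      if (method === "Runtime.evaluate") {
        return { result: { type: "string", value: "Example Domain" } };
      }
      if (method === "Page.captureScreenshot") {
        return { data: "iVBORw0KGgo=" };
      }
      return {};
    },
  };
}
```

Because the model composes raw protocol calls, it is not limited to a fixed action list: any CDP domain the session allows can be combined in one script.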

Browser sessions can be one-time, reused, or promoted from one-time to persistent during a run. This is useful when an agent needs a human to log in, complete MFA, or approve a sensitive action. The run can pause, keep the same tabs and cookies, and resume after approval.

The browser tools also add Live View URLs, optional session recording, and quick actions such as browser_markdown, browser_extract, browser_links, and browser_scrape for one-shot browsing tasks.

Resumable code execution with approvals

Codemode now uses createCodemodeRuntime, connectors, and a durable execution log. This lets you give a model one codemode tool instead of a large prompt full of tool definitions. The model can discover the capabilities it needs, write code against typed globals, and reuse saved snippets.

TypeScript

const runtime = createCodemodeRuntime({
  ctx: this.ctx,
  executor: new DynamicWorkerExecutor({ loader: this.env.LOADER }),
  connectors: [new GithubConnector(this.ctx, this.env, connection)],
});

const result = streamText({
  model,
  messages,
  tools: { codemode: runtime.tool() },
});

When the code reaches an approval-gated action, the runtime pauses execution and returns a pending approval. After approval, completed calls replay from the durable log, the approved action runs, and the same code continues. This makes it practical to build agents that create issues, update external systems, or perform other side effects without custom pause-and-resume logic for every tool.
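
The pause, replay, and resume behavior described above can be illustrated with a small, self-contained sketch. It is a simplified synchronous model, not the Codemode runtime: `run`, `RunState`, `PendingApproval`, and the `needsApproval` option are all hypothetical names, and the log lives in memory instead of durable storage.

```typescript
// Thrown to unwind execution when an approval-gated call is not yet approved.
class PendingApproval {
  callIndex: number;
  action: string;
  constructor(callIndex: number, action: string) {
    this.callIndex = callIndex;
    this.action = action;
  }
}

// One entry per completed call, in the order the code made them.
type LogEntry = { action: string; result: unknown };

interface RunState {
  log: LogEntry[];    // durable in a real system; in-memory in this sketch
  approved: number[]; // indexes of calls that a human has approved
}

type Call = <T>(action: string, fn: () => T, opts?: { needsApproval?: boolean }) => T;

type RunResult<T> =
  | { status: "done"; value: T }
  | { status: "pending"; callIndex: number; action: string };

// Runs `program` from the top. Calls already in the log return their recorded
// result without re-executing, so each side effect happens at most once.
function run<T>(state: RunState, program: (call: Call) => T): RunResult<T> {
  let cursor = 0;
  const call = <R>(action: string, fn: () => R, opts?: { needsApproval?: boolean }): R => {
    const index = cursor++;
    const recorded = state.log[index];
    if (recorded) {
      if (recorded.action !== action) throw new Error("Replay diverged from the log");
      return recorded.result as R;
    }
    if (opts && opts.needsApproval && state.approved.indexOf(index) === -1) {
      throw new PendingApproval(index, action);
    }
    const result = fn();
    state.log.push({ action, result });
    return result;
  };
  try {
    return { status: "done", value: program(call) };
  } catch (err) {
    if (err instanceof PendingApproval) {
      return { status: "pending", callIndex: err.callIndex, action: err.action };
    }
    throw err;
  }
}
```

In this model, the first `run` stops at the gated call and reports it as pending. Once that call index is added to `approved`, a second `run` with the same state replays the earlier results from the log and executes only the approved action and whatever follows it. Replay of this kind relies on the code making the same calls in the same order each time.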

Better Think delegation

Think sub-agents can now use client-defined tools over the RPC chat() path. A parent agent can pass tool schemas with clientTools and resolve tool calls through onClientToolCall. This lets delegated agents use caller-provided capabilities without requiring a browser WebSocket.

TypeScript

await child.chat(message, callback, {
  signal,
  clientTools: [
    {
      name: "get_user_timezone",
      description: "Get the caller's timezone",
      parameters: { type: "object" },
    },
  ],
  onClientToolCall: async ({ toolName, input }) => {
    return runClientTool(toolName, input);
  },
});
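
The example above calls `runClientTool` without defining it. One possible shape for such a dispatcher is sketched below; the handler registry and the convention of returning an `{ error }` payload for unknown or failing tools are assumptions for illustration, not behavior documented for the SDK.

```typescript
// Hypothetical handler type and registry; names here are illustrative only.
type ClientToolHandler = (input: unknown) => Promise<unknown> | unknown;

const clientToolHandlers: Record<string, ClientToolHandler> = {
  // Resolves the timezone of the runtime that hosts the caller.
  get_user_timezone: () => Intl.DateTimeFormat().resolvedOptions().timeZone,
};

async function runClientTool(toolName: string, input: unknown): Promise<unknown> {
  // Use an own-property check so names like "toString" are not resolved
  // from Object.prototype by accident.
  const known = Object.prototype.hasOwnProperty.call(clientToolHandlers, toolName);
  if (!known) {
    // Assumption: return an error payload so the model can see what went wrong.
    return { error: `Unknown client tool: ${toolName}` };
  }
  try {
    return await clientToolHandlers[toolName](input);
  } catch (err) {
    return { error: err instanceof Error ? err.message : String(err) };
  }
}
```

Keeping the registry on the caller's side means the child agent only ever sees tool schemas, while the parent decides which capabilities to expose and how each call is executed.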

Think Workflows also improve step.prompt(). A prompt step now runs a full agentic turn before returning structured output, so the agent can call tools before producing the typed result. This makes Workflow steps more useful for durable triage, research, and approval flows.

The unified Think execute tool can also include cdp.* browser capabilities alongside state.* and tools.* when Browser Run is bound.

Voice output device selection

Voice clients can route assistant audio to a specific output device. Use outputDeviceId with useVoiceAgent, or call client.setOutputDevice() from the framework-agnostic client.

TypeScript

const voice = useVoiceAgent({
  agent: "MyVoiceAgent",
  outputDeviceId: selectedSpeakerId,
});

Browsers without speaker-selection support continue playing through the default output device and report a non-fatal outputDeviceError.
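
The fallback described above resembles ordinary feature detection on the standard `HTMLMediaElement.setSinkId()` method. The sketch below shows that general pattern only; it is not the voice client's implementation, and `routeOutput`, `SinkCapableElement`, and the result shape are hypothetical (only the `outputDeviceError` name is taken from the text above).

```typescript
// Structural subset of HTMLMediaElement: setSinkId is absent in some browsers.
interface SinkCapableElement {
  setSinkId?: (sinkId: string) => Promise<void>;
}

interface OutputRouteResult {
  applied: boolean;          // true when audio was routed to the requested device
  outputDeviceError?: string; // non-fatal: playback continues on the default device
}

async function routeOutput(
  el: SinkCapableElement,
  deviceId: string,
): Promise<OutputRouteResult> {
  if (typeof el.setSinkId !== "function") {
    return {
      applied: false,
      outputDeviceError: "Speaker selection is not supported in this browser",
    };
  }
  try {
    await el.setSinkId(deviceId);
    return { applied: true };
  } catch (err) {
    // For example, the device was unplugged or permission was denied.
    return {
      applied: false,
      outputDeviceError: err instanceof Error ? err.message : String(err),
    };
  }
}
```

Reporting the failure as data instead of throwing lets the UI show a warning while audio keeps playing through the default output.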

Reliability fixes

This release includes several fixes for production agents:

useAgent and AgentClient handle WebSocket replacement more reliably during reconnects and configuration changes.
Chat stream replay is more reliable after reconnects, deploys, and provider errors.
Fiber recovery continues across multi-pass scans and backs off when recovery hooks keep failing.
Agent teardown continues even when the request that started teardown is canceled.
Large session histories use byte-budgeted reads to reduce memory pressure during startup.
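
As an illustration of the last item, a byte-budgeted read can be sketched as follows. How the SDK implements it is not stated here, so this shows the general idea only: `StoredMessage` and `readWithinBudget` are hypothetical names, and reading newest-first is an assumption.

```typescript
interface StoredMessage {
  id: number;
  body: string;
}

// Walk from newest to oldest and stop before the budget would be exceeded,
// then return the kept messages in chronological order.
function readWithinBudget(messages: StoredMessage[], maxBytes: number): StoredMessage[] {
  const encoder = new TextEncoder();
  const kept: StoredMessage[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    // Measure UTF-8 bytes, not string length, so multi-byte text is counted fairly.
    const size = encoder.encode(messages[i].body).length;
    if (used + size > maxBytes) break;
    used += size;
    kept.push(messages[i]);
  }
  return kept.reverse();
}
```

Capping by bytes rather than by message count keeps memory use predictable even when a few messages are very large.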

Upgrade

To update to the latest version:

npm i agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest

yarn add agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest

pnpm add agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest

bun add agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest

Refer to the Codemode documentation, Browser tools documentation, Think tools documentation, and Voice documentation for more information.

Source: Cloudflare