Agents, Workers – Agents SDK improves browser automation, code execution, and recovery

The latest release of the Agents SDK makes it easier to build agents that can safely interact with real systems and keep working through interruptions.

Agents can now browse websites through Browser Run, write code against external tools through Codemode, use client-provided tools when delegating to Think sub-agents, and recover more reliably from deploys, Durable Object evictions, and connection churn.

Safer browser automation

Agents can now use Browser Run through a single durable browser_execute tool. Instead of choosing from a fixed list of actions, the model writes code against the Chrome DevTools Protocol (CDP) and can inspect pages, capture screenshots, read rendered content, debug frontend behavior, and interact with live browser sessions.

  • JavaScript / TypeScript

    // Create the browser tools from the agent's context and its env bindings.
    const browserTools = createBrowserTools({
      ctx: this.ctx,
      browser: this.env.BROWSER,
      loader: this.env.LOADER,
      session: { mode: "dynamic" },
    });

Browser sessions can be one-time, reused, or promoted from one-time to persistent during a run. This is useful when an agent needs a human to log in, complete MFA, or approve a sensitive action. The run can pause, keep the same tabs and cookies, and resume after approval.

The browser tools also add Live View URLs, optional session recording, and quick actions such as browser_markdown, browser_extract, browser_links, and browser_scrape for one-shot browsing tasks.

Resumable code execution with approvals

Codemode now uses createCodemodeRuntime, connectors, and a durable execution log. This lets you give a model one codemode tool instead of a large prompt full of tool definitions. The model can discover the capabilities it needs, write code against typed globals, and reuse saved snippets.

  • JavaScript / TypeScript

    // Build one runtime from an executor and the connectors the model may use.
    const runtime = createCodemodeRuntime({
      ctx: this.ctx,
      executor: new DynamicWorkerExecutor({ loader: this.env.LOADER }),
      connectors: [new GithubConnector(this.ctx, this.env, connection)],
    });

    // Expose the runtime to the model as a single codemode tool.
    const result = streamText({
      model,
      messages,
      tools: { codemode: runtime.tool() },
    });

When the code reaches an approval-gated action, the runtime pauses execution and returns a pending approval. After approval, completed calls replay from the durable log, the approved action runs, and the same code continues. This makes it practical to build agents that create issues, update external systems, or perform other side effects without custom pause-and-resume logic for every tool.
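The pause-and-replay behavior described above can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not the @cloudflare/codemode implementation: ReplayContext, PauseSignal, run, and the in-memory log array are hypothetical names, and a real runtime would persist the log durably and run effects asynchronously.

```typescript
// Hypothetical sketch of replay-from-log with an approval gate.
// None of these names come from @cloudflare/codemode.

type LogEntry = { name: string; result: unknown };

type RunOutcome =
  | { status: "done"; value: unknown }
  | { status: "pending_approval"; seq: number; name: string };

// Thrown to unwind the program when a gated call still needs approval.
class PauseSignal {
  seq: number;
  callName: string;
  constructor(seq: number, callName: string) {
    this.seq = seq;
    this.callName = callName;
  }
}

class ReplayContext {
  private seq = 0;
  private log: LogEntry[];
  private approved: Set<number>;

  constructor(log: LogEntry[], approved: Set<number>) {
    this.log = log;
    this.approved = approved;
  }

  // Runs an effect once; on later runs the recorded result is returned instead.
  call<T>(name: string, needsApproval: boolean, effect: () => T): T {
    const seq = this.seq++;
    const recorded = this.log[seq];
    if (recorded !== undefined) {
      if (recorded.name !== name) {
        throw new Error(`Replay diverged at call ${seq}: ${recorded.name} vs ${name}`);
      }
      return recorded.result as T; // replayed, the effect does not run again
    }
    if (needsApproval && !this.approved.has(seq)) {
      throw new PauseSignal(seq, name);
    }
    const result = effect();
    this.log.push({ name, result });
    return result;
  }
}

function run(
  program: (ctx: ReplayContext) => unknown,
  log: LogEntry[],
  approved: Set<number>,
): RunOutcome {
  try {
    return { status: "done", value: program(new ReplayContext(log, approved)) };
  } catch (err) {
    if (err instanceof PauseSignal) {
      return { status: "pending_approval", seq: err.seq, name: err.callName };
    }
    throw err;
  }
}

// Usage: the same program runs twice against the same log.
const sideEffects: string[] = [];
const program = (ctx: ReplayContext): string => {
  const user = ctx.call("lookupUser", false, () => {
    sideEffects.push("lookup");
    return "ada";
  });
  const issue = ctx.call("createIssue", true, () => {
    sideEffects.push("create");
    return 42;
  });
  return `${user}:${issue}`;
};

const executionLog: LogEntry[] = [];
const firstRun = run(program, executionLog, new Set<number>()); // pauses at call 1
const secondRun = run(program, executionLog, new Set<number>([1])); // call 1 approved
```

In this sketch the second run returns the recorded lookupUser result without repeating the lookup, performs the approved createIssue call, and continues to the end of the same program. The approach only works if the program issues the same calls in the same order on every run, which is why the sketch throws when the replayed call name diverges from the log.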

Better Think delegation

Think sub-agents can now use client-defined tools over the RPC chat() path. A parent agent can pass tool schemas with clientTools and resolve tool calls through onClientToolCall. This lets delegated agents use caller-provided capabilities without requiring a browser WebSocket.

  • JavaScript / TypeScript

    await child.chat(message, callback, {
      signal,
      // Tool schemas the caller offers to the sub-agent.
      clientTools: [
        {
          name: "get_user_timezone",
          description: "Get the caller's timezone",
          parameters: { type: "object" },
        },
      ],
      // Resolve each client tool call on the caller's side.
      onClientToolCall: async ({ toolName, input }) => {
        return runClientTool(toolName, input);
      },
    });
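The snippet above calls runClientTool without defining it. One way such a helper might look is a small registry keyed by tool name. This is a sketch under assumptions: the helper, the clientToolHandlers map, and the shape of the returned values are hypothetical and not part of the Think API.

```typescript
// Hypothetical helper for the onClientToolCall callback shown above.
type ClientToolHandler = (input: unknown) => unknown;

// Registry of caller-side tools, keyed by the names advertised in clientTools.
const clientToolHandlers = new Map<string, ClientToolHandler>([
  [
    "get_user_timezone",
    () => ({ timezone: Intl.DateTimeFormat().resolvedOptions().timeZone }),
  ],
]);

function runClientTool(toolName: string, input: unknown): unknown {
  const handler = clientToolHandlers.get(toolName);
  if (!handler) {
    // Return a structured error rather than throwing, so the sub-agent
    // can see that the tool is unavailable and react to it.
    return { error: `Unknown client tool: ${toolName}` };
  }
  try {
    return handler(input);
  } catch (err) {
    return { error: err instanceof Error ? err.message : String(err) };
  }
}

const timezoneResult = runClientTool("get_user_timezone", {});
const unknownResult = runClientTool("delete_everything", {});
```

Returning errors as data is a design choice in this sketch; a caller that prefers to fail the whole delegation could throw instead.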

Think Workflows also improve step.prompt(). A prompt step now runs a full agentic turn before returning structured output, so the agent can call tools before producing the typed result. This makes Workflow steps more useful for durable triage, research, and approval flows.
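Conceptually, a "full agentic turn" is a loop: the model may request tools any number of times, and only its final answer is parsed into the typed result. The sketch below models that loop with a scripted stand-in for the model. It is illustrative only; runPromptStep, FakeModel, and ToolSet are hypothetical names and do not reflect the actual step.prompt() signature.

```typescript
// Hypothetical model of one agentic turn; not the Think Workflows API.
type ModelStep =
  | { type: "tool_call"; tool: string; input: unknown }
  | { type: "final"; output: unknown };

// A stand-in model that decides its next step from the tool results so far.
type FakeModel = (toolResults: unknown[]) => ModelStep;
type ToolSet = Record<string, (input: unknown) => unknown>;

function runPromptStep<T>(
  model: FakeModel,
  tools: ToolSet,
  parse: (output: unknown) => T,
  maxSteps = 8,
): T {
  const toolResults: unknown[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = model(toolResults);
    if (step.type === "final") {
      return parse(step.output); // validate before returning the typed result
    }
    const tool = tools[step.tool];
    toolResults.push(tool ? tool(step.input) : { error: `Unknown tool: ${step.tool}` });
  }
  throw new Error(`No structured output after ${maxSteps} steps`);
}

// Usage: the scripted model calls one tool, then produces structured output.
type Triage = { severity: "low" | "high" };

const scriptedModel: FakeModel = (results) =>
  results.length === 0
    ? { type: "tool_call", tool: "countErrors", input: { service: "api" } }
    : { type: "final", output: { severity: (results[0] as number) > 10 ? "high" : "low" } };

const triage = runPromptStep<Triage>(scriptedModel, { countErrors: () => 25 }, (output) => {
  const severity = (output as { severity?: unknown }).severity;
  if (severity === "low" || severity === "high") return { severity };
  throw new Error("Invalid triage output");
});
```

The step limit is an assumption of this sketch; it keeps a misbehaving model from looping forever inside a single step.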

The unified Think execute tool can also include cdp.* browser capabilities alongside state.* and tools.* when Browser Run is bound.

Voice output device selection

Voice clients can route assistant audio to a specific output device. Use outputDeviceId with useVoiceAgent, or call client.setOutputDevice() from the framework-agnostic client.

  • JavaScript / TypeScript

    // Route assistant audio to the speaker the user selected.
    const voice = useVoiceAgent({
      agent: "MyVoiceAgent",
      outputDeviceId: selectedSpeakerId,
    });

Browsers without speaker-selection support continue playing through the default output device and report a non-fatal outputDeviceError.
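The fallback described above resembles the usual feature-detection pattern for the standard HTMLMediaElement.setSinkId() method. The sketch below shows that pattern only; it is not the SDK's implementation, and routeToOutputDevice, AudioSink, and RouteResult are hypothetical names. The structural AudioSink type stands in for an audio element so the code also runs outside a browser.

```typescript
// Minimal structural type covering the one method this sketch needs.
type AudioSink = { setSinkId?: (sinkId: string) => Promise<void> };

type RouteResult = { routed: boolean; outputDeviceError?: string };

async function routeToOutputDevice(element: AudioSink, deviceId: string): Promise<RouteResult> {
  // Feature-detect: some browsers do not implement setSinkId at all.
  if (typeof element.setSinkId !== "function") {
    return { routed: false, outputDeviceError: "Output device selection is not supported" };
  }
  try {
    await element.setSinkId(deviceId);
    return { routed: true };
  } catch (err) {
    // Treat the failure as non-fatal: audio keeps playing on the default device.
    return { routed: false, outputDeviceError: err instanceof Error ? err.message : String(err) };
  }
}
```

In a browser, an HTMLAudioElement could be passed as the element and a deviceId from navigator.mediaDevices.enumerateDevices() as the target.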

Reliability fixes

This release includes several fixes for production agents:

  • useAgent and AgentClient handle WebSocket replacement more reliably during reconnects and configuration changes.
  • Chat stream replay is more reliable after reconnects, deploys, and provider errors.
  • Fiber recovery continues across multi-pass scans and backs off when recovery hooks keep failing.
  • Agent teardown continues even when the request that started teardown is canceled.
  • Large session histories use byte-budgeted reads to reduce memory pressure during startup.
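As an illustration of the last item, a byte-budgeted read can be sketched as walking the history from newest to oldest and stopping before a size limit is exceeded. This is a generic sketch, assuming messages are plain strings measured as UTF-8; readWithinByteBudget is a hypothetical name and the SDK's storage format and budgeting policy may differ.

```typescript
// Keep the most recent messages whose combined UTF-8 size fits the budget.
function readWithinByteBudget(history: string[], maxBytes: number): string[] {
  const encoder = new TextEncoder();
  const kept: string[] = [];
  let usedBytes = 0;
  // Walk from newest to oldest so recent context is preferred.
  for (let i = history.length - 1; i >= 0; i--) {
    const size = encoder.encode(history[i]).byteLength;
    if (usedBytes + size > maxBytes) break;
    usedBytes += size;
    kept.push(history[i]);
  }
  return kept.reverse(); // restore chronological order
}

// "h\u00e9llo" is 6 bytes in UTF-8 (the accented letter takes 2), "ok" is 2.
const sessionHistory = ["old message", "h\u00e9llo", "ok"];
const recentMessages = readWithinByteBudget(sessionHistory, 8);
```

Measuring bytes rather than counting messages or characters keeps the memory bound meaningful when messages vary widely in size or contain multi-byte text.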

Upgrade

To update to the latest version:

  • npm

    npm i agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest
  • yarn

    yarn add agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest
  • pnpm

    pnpm add agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest
  • bun

    bun add agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest

Refer to the Codemode documentation, Browser tools documentation, Think tools documentation, and Voice documentation for more information.

Source: Cloudflare


