Introducing GLM-4.7-Flash on Workers AI & TanStack AI

We’re excited to announce GLM-4.7-Flash on Workers AI, a fast and efficient text generation model optimized for multilingual dialogue and instruction-following tasks, along with the brand-new @cloudflare/tanstack-ai package and workers-ai-provider v3.1.1.

You can now run AI agents entirely on Cloudflare. With GLM-4.7-Flash’s multi-turn tool calling support, plus full compatibility with TanStack AI and the Vercel AI SDK, you have everything you need to build agentic applications that run completely at the edge.

GLM-4.7-Flash — Multilingual Text Generation Model

@cf/zai-org/glm-4.7-flash is a multilingual model with a 131,072-token context window, which suits it to long-form content generation, complex reasoning tasks, and multilingual applications.

Key Features and Use Cases:

Multi-turn Tool Calling for Agents: Build AI agents that can call functions and tools across multiple conversation turns
Multilingual Support: Built to handle content generation in multiple languages effectively
Large Context Window: 131,072 tokens for long-form writing, complex reasoning, and processing long documents
Fast Inference: Optimized for low-latency responses in chatbots and virtual assistants
Instruction Following: Excellent at following complex instructions for code generation and structured tasks
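
To make "multi-turn tool calling" concrete, here is a minimal sketch of the loop an agent might run: call the model, execute any tools it requests, append the results to the conversation, and repeat until the model answers without asking for a tool. Every type and name below (ToolCall, ModelReply, runAgent, callModel) is hypothetical and chosen for illustration. It is not the Workers AI response format, and the model call is injected so the sketch stays independent of any SDK.

```typescript
// Hypothetical shapes for illustration only; not the Workers AI wire format.
type ToolCall = { name: string; arguments: Record<string, unknown> };
type Message = { role: "user" | "assistant" | "tool"; content: string; name?: string };
type ModelReply = { content: string; toolCalls: ToolCall[] };
type Tool = (args: Record<string, unknown>) => Promise<string> | string;

// Calls the model repeatedly, feeding each tool result back into the history,
// until the model replies without requesting a tool or the turn budget runs out.
async function runAgent(
  callModel: (history: Message[]) => Promise<ModelReply>,
  tools: Map<string, Tool>,
  prompt: string,
  maxTurns = 5,
): Promise<{ answer: string; history: Message[] }> {
  const history: Message[] = [{ role: "user", content: prompt }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await callModel(history);
    history.push({ role: "assistant", content: reply.content });
    if (reply.toolCalls.length === 0) {
      return { answer: reply.content, history };
    }
    for (const call of reply.toolCalls) {
      const tool = tools.get(call.name);
      // Report unknown tools back to the model instead of throwing.
      const result = tool ? await tool(call.arguments) : `Unknown tool: ${call.name}`;
      history.push({ role: "tool", name: call.name, content: result });
    }
  }
  throw new Error(`No final answer after ${maxTurns} turns`);
}
```

In a real Worker, callModel would wrap whichever client you use (the binding, the REST API, or an SDK adapter) and translate its response into the ModelReply shape assumed here. The maxTurns cap is a design choice to stop a model that keeps requesting tools from looping forever.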

Use GLM-4.7-Flash through the Workers AI binding (env.AI.run()), the REST API at /run or /v1/chat/completions, AI Gateway, or via workers-ai-provider for the Vercel AI SDK.
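
As a rough sketch of two of these paths, the helpers below build a chat request for the binding and for the OpenAI-compatible REST endpoint. The helper names (buildMessages, askViaBinding, buildRestRequest) and the AiBinding interface are assumptions made for this example; the interface is a minimal stand-in so the code compiles without Cloudflare's type packages. The response is left as unknown because its exact shape is not described here.

```typescript
const MODEL = "@cf/zai-org/glm-4.7-flash";

type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Minimal structural stand-in for the Workers AI binding (an assumption for
// this sketch; the real binding type is richer).
interface AiBinding {
  run(model: string, inputs: { messages: ChatMessage[] }): Promise<unknown>;
}

// Builds a chat history with an optional system prompt.
function buildMessages(prompt: string, system?: string): ChatMessage[] {
  const messages: ChatMessage[] = [];
  if (system) messages.push({ role: "system", content: system });
  messages.push({ role: "user", content: prompt });
  return messages;
}

// Binding path: call this inside a Worker, passing env.AI.
function askViaBinding(ai: AiBinding, prompt: string): Promise<unknown> {
  return ai.run(MODEL, { messages: buildMessages(prompt) });
}

// REST path: returns the URL and fetch options for /v1/chat/completions.
function buildRestRequest(accountId: string, apiToken: string, prompt: string) {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model: MODEL, messages: buildMessages(prompt) }),
    },
  };
}
```

With these in place, a REST call would look something like `const { url, init } = buildRestRequest(accountId, apiToken, "Hello"); const res = await fetch(url, init);`, where accountId and apiToken are your own credentials.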

Pricing is available on the model page or pricing page.

@cloudflare/tanstack-ai v0.1.1 — TanStack AI adapters for Workers AI and AI Gateway

We’ve released @cloudflare/tanstack-ai, a new package that brings Workers AI and AI Gateway support to TanStack AI. This provides a framework-agnostic alternative for developers who prefer TanStack’s approach to building AI applications.

Workers AI adapters support four configuration modes — plain binding (env.AI), plain REST, AI Gateway binding (env.AI.gateway(id)), and AI Gateway REST — across all capabilities:

Chat (createWorkersAiChat) — Streaming chat completions with tool calling, structured output, and reasoning text streaming.
Image generation (createWorkersAiImage) — Text-to-image models.
Transcription (createWorkersAiTranscription) — Speech-to-text.
Text-to-speech (createWorkersAiTts) — Audio generation.
Summarization (createWorkersAiSummarize) — Text summarization.

AI Gateway adapters route requests from third-party providers — OpenAI, Anthropic, Gemini, Grok, and OpenRouter — through Cloudflare AI Gateway for caching, rate limiting, and unified billing.

To get started:

npm install @cloudflare/tanstack-ai @tanstack/ai

workers-ai-provider v3.1.1 — transcription, speech, reranking, and reliability

The Workers AI provider for the Vercel AI SDK now supports three new capabilities beyond chat and image generation:

Transcription (provider.transcription(model)) — Speech-to-text with automatic handling of model-specific input formats across binding and REST paths.
Text-to-speech (provider.speech(model)) — Audio generation with support for voice and speed options.
Reranking (provider.reranking(model)) — Document reranking for RAG pipelines and search result ordering.

import { createWorkersAI } from "workers-ai-provider";
import {
  experimental_transcribe,
  experimental_generateSpeech,
  rerank,
} from "ai";

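// Assumes this code runs inside a Worker handler, where `env.AI` is the
// Workers AI binding.
const workersai = createWorkersAI({ binding: env.AI });

// `audioData` is assumed to hold the audio to transcribe (for example, the
// raw bytes of a WAV file).
const transcript = await experimental_transcribe({
  model: workersai.transcription("@cf/openai/whisper-large-v3-turbo"),
  audio: audioData,
  mediaType: "audio/wav",
});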

const speech = await experimental_generateSpeech({
  model: workersai.speech("@cf/deepgram/aura-1"),
  text: "Hello world",
  voice: "asteria",
});

const ranked = await rerank({
  model: workersai.reranking("@cf/baai/bge-reranker-base"),
  query: "What is machine learning?",
  documents: ["ML is a branch of AI.", "The weather is sunny."],
});

This release also includes a comprehensive reliability overhaul (v3.0.5):

Fixed streaming — Responses now stream token-by-token instead of buffering all chunks, using a proper TransformStream pipeline with backpressure.
Fixed tool calling — Resolved issues with tool call ID sanitization, conversation history preservation, and a heuristic that silently fell back to non-streaming mode when tools were defined.
Premature stream termination detection — Streams that end unexpectedly now report finishReason: "error" instead of silently reporting "stop".
AI Search support — Added createAISearch as the canonical export (renamed from AutoRAG). createAutoRAG still works with a deprecation warning.
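
To illustrate the premature-termination idea, the sketch below turns server-sent "data:" lines into per-token parts and reports finishReason "error" when the input ends before an end-of-stream marker. It is not the provider's implementation: it uses an async generator instead of a TransformStream to stay self-contained, and the chunk shape ({ response }), the "[DONE]" sentinel, and the names StreamPart and toParts are assumptions made for this example.

```typescript
type StreamPart =
  | { type: "text-delta"; text: string }
  | { type: "finish"; finishReason: "stop" | "error" };

// Consumes SSE-style "data:" lines and yields one part per token. If the
// source ends (or a chunk is truncated) before the "[DONE]" sentinel, the
// final part reports "error" instead of "stop".
async function* toParts(
  lines: AsyncIterable<string> | Iterable<string>,
): AsyncGenerator<StreamPart> {
  let sawDone = false;
  for await (const line of lines) {
    if (!line.startsWith("data:")) continue; // skip comments and blank lines
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") {
      sawDone = true;
      break;
    }
    let chunk: { response?: string };
    try {
      chunk = JSON.parse(payload);
    } catch {
      break; // a truncated chunk counts as premature termination
    }
    if (chunk.response) yield { type: "text-delta", text: chunk.response };
  }
  yield { type: "finish", finishReason: sawDone ? "stop" : "error" };
}
```

Because a generator only produces the next part when the consumer asks for it, tokens are forwarded one at a time rather than buffered, which is the same pull-based behavior that backpressure gives a stream pipeline.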

To upgrade:

npm install workers-ai-provider@latest ai

Resources

Source: Cloudflare
