Workers AI – Google Gemma 4 26B A4B now available on Workers AI

Modern Workspace Pro | 5 April 2026 | Cloudflare, Workers AI

We are partnering with Google to bring @cf/google/gemma-4-26b-a4b-it to Workers AI. Gemma 4 26B A4B is a Mixture-of-Experts (MoE) model built from Gemini 3 research, with 26B total parameters and only 4B active per forward pass. By activating a small subset of parameters during inference, the model runs almost as fast as a 4B-parameter model while delivering the quality of a much larger one.

Gemma 4 is Google’s most capable family of open models, designed to maximize intelligence-per-parameter.

Key capabilities

Mixture-of-Experts architecture with 8 active experts out of 128 total (plus 1 shared expert), delivering frontier-level performance at a fraction of the compute cost of dense models
256,000 token context window for retaining full conversation history, tool definitions, and long documents across extended sessions
Built-in thinking mode that lets the model reason step-by-step before answering, improving accuracy on complex tasks
Vision understanding for object detection, document and PDF parsing, screen and UI understanding, chart comprehension, OCR (including multilingual), and handwriting recognition, with support for variable aspect ratios and resolutions
Function calling with native support for structured tool use, enabling agentic workflows and multi-step planning
Multilingual with out-of-the-box support for 35+ languages, pre-trained on 140+ languages
Coding for code generation, completion, and correction
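
To make the Mixture-of-Experts figures above concrete, here is a minimal sketch of generic top-k expert routing in TypeScript. It is not Gemma 4's actual router: the function name routeTopK, the softmax over the selected scores, and the toy router scores are all assumptions for illustration. Only the expert counts (128 total, 8 active, 1 shared) and the parameter counts (26B total, 4B active) come from the announcement.

```typescript
// Illustrative top-k expert routing. This is a generic sketch, NOT Gemma 4's
// actual implementation; routeTopK and the toy scores are hypothetical.
interface RoutedExpert {
  index: number;  // which expert was selected
  weight: number; // normalized gate weight for that expert
}

// Pick the k experts with the highest router scores, then softmax-normalize
// those k scores so the selected weights sum to 1.
function routeTopK(routerScores: number[], k: number): RoutedExpert[] {
  if (k <= 0 || k > routerScores.length) {
    throw new RangeError("k must be between 1 and the number of experts");
  }
  const ranked = routerScores
    .map((score, index) => ({ score, index }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
  // Subtract the maximum score before exponentiating for numerical stability.
  const maxScore = Math.max(...ranked.map((e) => e.score));
  const exps = ranked.map((e) => Math.exp(e.score - maxScore));
  const total = exps.reduce((sum, v) => sum + v, 0);
  return ranked.map((e, i) => ({ index: e.index, weight: exps[i] / total }));
}

// Figures from the announcement.
const TOTAL_EXPERTS = 128;
const ACTIVE_EXPERTS = 8;
const SHARED_EXPERTS = 1;

// Hypothetical router scores for a single token (deterministic toy pattern).
const scores = Array.from({ length: TOTAL_EXPERTS }, (_, i) => Math.sin(i * 0.37));
const selected = routeTopK(scores, ACTIVE_EXPERTS);

// Experts that run for this token: the routed ones plus the shared one.
const expertsPerToken = ACTIVE_EXPERTS + SHARED_EXPERTS;

// Share of routed experts used per token: 8 / 128 = 0.0625.
const routedFraction = ACTIVE_EXPERTS / TOTAL_EXPERTS;

// Share of parameters active per forward pass: 4B / 26B, roughly 0.154.
const activeParamFraction = 4 / 26;

console.log(selected.map((e) => e.index), expertsPerToken, routedFraction, activeParamFraction);
```

Because only the selected experts run for each token, per-token compute tracks the active parameter count rather than the total, which is the intuition behind the speed claim in the introduction.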

Use Gemma 4 26B A4B through the Workers AI binding (env.AI.run()), the REST API at /run, or the OpenAI-compatible endpoint at /v1/chat/completions.
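
As a rough illustration of two of these access paths, the TypeScript sketch below calls the model through a binding and builds a request for the chat completions route. The model ID is the one from this announcement. Everything else is an assumption to verify against the model page: the AiBinding interface is a hand-written minimal subset of the binding, the helper names askViaBinding and buildChatCompletionsRequest are hypothetical, and the base URL layout and messages-style input follow the commonly documented Workers AI pattern.

```typescript
// Model ID taken from the announcement.
const MODEL = "@cf/google/gemma-4-26b-a4b-it";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Assumed minimal shape of the Workers AI binding (env.AI) for this sketch.
interface AiBinding {
  run(model: string, inputs: { messages: ChatMessage[] }): Promise<unknown>;
}

// Call the model through the binding, for example with env.AI inside a Worker.
async function askViaBinding(ai: AiBinding, prompt: string): Promise<unknown> {
  return ai.run(MODEL, { messages: [{ role: "user", content: prompt }] });
}

// Build (but do not send) a request for the OpenAI-compatible route.
// The URL layout below is an assumption; confirm it on the model page.
function buildChatCompletionsRequest(
  accountId: string,
  apiToken: string,
  messages: ChatMessage[],
) {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model: MODEL, messages }),
    },
  };
}
```

Inside a Worker, the first helper would be called as askViaBinding(env.AI, "Summarize this document"), while the object returned by the second can be passed to fetch(request.url, request.init) from any HTTP client. Keeping the request builder separate from the network call makes it easy to inspect or test the payload before sending it.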

For more information, refer to the Gemma 4 26B A4B model page.

Source: Cloudflare
