Workers AI – Google Gemma 4 26B A4B now available on Workers AI

We are partnering with Google to bring @cf/google/gemma-4-26b-a4b-it to Workers AI. Gemma 4 26B A4B is a Mixture-of-Experts (MoE) model built from Gemini 3 research, with 26B total parameters and only 4B active per forward pass. By activating a small subset of parameters during inference, the model runs almost as fast as a 4B-parameter model while delivering the quality of a much larger one.

Gemma 4 is Google’s most capable family of open models, designed to maximize intelligence-per-parameter.

Key capabilities

  • Mixture-of-Experts architecture with 8 active experts out of 128 total (plus 1 shared expert), delivering frontier-level performance at a fraction of the compute cost of dense models
  • 256,000 token context window for retaining full conversation history, tool definitions, and long documents across extended sessions
  • Built-in thinking mode that lets the model reason step-by-step before answering, improving accuracy on complex tasks
  • Vision understanding for object detection, document and PDF parsing, screen and UI understanding, chart comprehension, OCR (including multilingual), and handwriting recognition, with support for variable aspect ratios and resolutions
  • Function calling with native support for structured tool use, enabling agentic workflows and multi-step planning
  • Multilingual with out-of-the-box support for 35+ languages, pre-trained on 140+ languages
  • Coding for code generation, completion, and correction
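
To make the Mixture-of-Experts bullet concrete, here is a toy sketch of top-k expert routing, the general technique MoE layers use to activate only a few experts per token. It is not Gemma 4's actual router: the scores are synthetic and `topKExperts` is a hypothetical helper. Only the 8-of-128 figures come from the list above.

```typescript
// Toy illustration of top-k routing in a Mixture-of-Experts layer.
// Not Gemma 4's real implementation; the router scores below are synthetic.

/** Pick the k highest-scoring experts and softmax-normalize their weights. */
function topKExperts(
  scores: number[],
  k: number,
): { index: number; weight: number }[] {
  const ranked = scores
    .map((score, index) => ({ index, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
  const maxScore = ranked[0].score; // subtract the max for numerical stability
  const exps = ranked.map((e) => Math.exp(e.score - maxScore));
  const total = exps.reduce((sum, x) => sum + x, 0);
  return ranked.map((e, i) => ({ index: e.index, weight: exps[i] / total }));
}

const TOTAL_EXPERTS = 128; // routed experts, per the list above
const ACTIVE_EXPERTS = 8; // experts activated per token, per the list above

// Deterministic stand-in for the router's per-expert scores for one token.
const scores = Array.from({ length: TOTAL_EXPERTS }, (_, i) =>
  Math.sin(i * 12.9898) * 4,
);

const selected = topKExperts(scores, ACTIVE_EXPERTS);
const activeShare = (ACTIVE_EXPERTS / TOTAL_EXPERTS) * 100;

console.log(selected.map((e) => e.index)); // the 8 experts that would run
console.log(`${activeShare.toFixed(2)}% of routed experts active`); // 6.25%
```

Only the selected experts run for a given token, which is why compute per token tracks the active parameter count rather than the total.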

Use Gemma 4 26B A4B through the Workers AI binding (env.AI.run()), the REST API at /run, or the OpenAI-compatible endpoint at /v1/chat/completions.
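
As a minimal sketch of the binding route, the Worker below forwards a prompt to the model through env.AI.run(). It assumes the binding is named AI in the Wrangler configuration and that the model accepts the chat-style `messages` input used by other Workers AI text models. The `AiBinding` interface, `askGemma` helper, and `q` query parameter are illustrative names, not part of the announcement. The response is returned as-is because its exact shape should be checked on the model page.

```typescript
// Sketch only: the { messages: [...] } input follows the common Workers AI
// chat format; confirm the exact schema on the model page.

const MODEL = "@cf/google/gemma-4-26b-a4b-it";

/** Minimal stand-in for the Workers AI binding type (assumption, for self-containment). */
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface Env {
  AI: AiBinding; // assumes the binding is named "AI"
}

/** Send a single user prompt to the model and return the raw result. */
async function askGemma(ai: AiBinding, prompt: string): Promise<unknown> {
  return ai.run(MODEL, {
    messages: [
      { role: "system", content: "You are a concise assistant." },
      { role: "user", content: prompt },
    ],
  });
}

/** Worker handler: answers the hypothetical `q` query parameter. */
const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    const prompt = new URL(request.url).searchParams.get("q") ?? "Say hello.";
    const result = await askGemma(env.AI, prompt);
    return Response.json(result);
  },
};
```

In a deployed Worker, the handler would be exported with `export default worker;`.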

For more information, refer to the Gemma 4 26B A4B model page.

Source: Cloudflare


