We’re excited to partner with NVIDIA to bring `@cf/nvidia/nemotron-3-120b-a12b` to Workers AI. NVIDIA Nemotron 3 Super is a Mixture-of-Experts (MoE) model with a hybrid Mamba-transformer architecture, 120B total parameters, and 12B active parameters per forward pass.
The model is optimized for running many collaborating agents per application. It delivers high accuracy for reasoning, tool calling, and instruction following across complex multi-step tasks.
Key capabilities:
- Hybrid Mamba-transformer architecture delivers over 50% higher token generation throughput compared to leading open models, reducing latency for real-world applications
- Tool calling support for building AI agents that invoke tools across multiple conversation turns
- Multi-Token Prediction (MTP) accelerates long-form text generation by predicting several future tokens simultaneously in a single forward pass
- 32,000-token context window for retaining conversation history and plan states across multi-step agent workflows
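To make the tool calling capability concrete, here is a minimal TypeScript sketch of the local half of a tool loop: a tool definition you would send with the request, and a dispatcher that turns a tool call from the model into a message for the next conversation turn. The OpenAI-style `ToolDefinition` shape, the `ToolCall` shape, the `get_weather` tool, and the `runToolCall` helper are all assumptions for illustration; check the model page for the exact request and response format.

```typescript
// Assumed OpenAI-style tool definition; the exact schema the model accepts
// should be confirmed on the model page.
interface ToolDefinition {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>;
  };
}

// Assumed shape of a tool call after it has been parsed from a model response.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

interface ToolMessage {
  role: "tool";
  name: string;
  content: string;
}

// Hypothetical tool, used only for illustration.
const getWeatherTool: ToolDefinition = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Return a short weather summary for a city",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
};

// Local implementations, keyed by tool name.
const toolHandlers: Record<string, (args: Record<string, unknown>) => string> = {
  get_weather: (args) => `Sunny in ${String(args.city)}`,
};

// Runs one tool call and wraps the result as a message that can be appended
// to the conversation before the next model turn.
function runToolCall(call: ToolCall): ToolMessage {
  const handler = toolHandlers[call.name];
  if (!handler) {
    return { role: "tool", name: call.name, content: `Unknown tool: ${call.name}` };
  }
  return { role: "tool", name: call.name, content: handler(call.arguments) };
}
```

In a multi-turn agent, you would append each `ToolMessage` to the message history and call the model again until it stops requesting tools.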
Use Nemotron 3 Super through the Workers AI binding (`env.AI.run()`), the REST API, or the OpenAI-compatible endpoint.
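As a rough sketch of the two HTTP options, the TypeScript below builds (but does not send) requests for the REST API and the OpenAI-compatible endpoint. The URL patterns follow the general Workers AI documentation; the helper names `buildRestRequest` and `buildOpenAICompatRequest`, the `messages` input shape, and the placeholder account ID and token are assumptions, so confirm the details on the model page.

```typescript
// Model ID from the announcement.
const MODEL_ID = "@cf/nvidia/nemotron-3-120b-a12b";
const API_BASE = "https://api.cloudflare.com/client/v4/accounts";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Shared headers: a Cloudflare API token sent as a bearer token.
function authHeaders(apiToken: string): Record<string, string> {
  return {
    Authorization: `Bearer ${apiToken}`,
    "Content-Type": "application/json",
  };
}

// REST API: the model ID is part of the URL path.
function buildRestRequest(
  accountId: string,
  apiToken: string,
  messages: ChatMessage[],
): PreparedRequest {
  return {
    url: `${API_BASE}/${accountId}/ai/run/${MODEL_ID}`,
    init: {
      method: "POST",
      headers: authHeaders(apiToken),
      body: JSON.stringify({ messages }),
    },
  };
}

// OpenAI-compatible endpoint: the model ID goes in the JSON body.
function buildOpenAICompatRequest(
  accountId: string,
  apiToken: string,
  messages: ChatMessage[],
): PreparedRequest {
  return {
    url: `${API_BASE}/${accountId}/ai/v1/chat/completions`,
    init: {
      method: "POST",
      headers: authHeaders(apiToken),
      body: JSON.stringify({ model: MODEL_ID, messages }),
    },
  };
}
```

To send a request, pass the result to `fetch`, for example `const { url, init } = buildRestRequest(accountId, token, messages); const res = await fetch(url, init);`. Inside a Worker with an AI binding, the equivalent call would be along the lines of `await env.AI.run(MODEL_ID, { messages })`.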
For more information, refer to the Nemotron 3 Super model page.
Source: Cloudflare