Today, Amazon EKS announces support for up to 100,000 worker nodes in a cluster, enabling you to run ultra-scale AI/ML training and inference workloads in a single cluster. With Amazon EC2’s new-generation accelerated computing instance types, 100,000 worker nodes support up to 1.6 million Trainium chips with Trn2 instances and 800,000 NVIDIA GPUs with P5 and P6 instances in a single cluster. This scale matters for workloads that need all of their compute accelerators in one cluster because they cannot be easily distributed across multiple clusters.
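The accelerator totals follow from simple multiplication. The sketch below reproduces them, assuming 16 Trainium chips per Trn2 node and 8 GPUs per P5 or P6 node; these per-node counts are not stated in the announcement and are inferred here from the quoted totals, and the names `MAX_NODES`, `ACCELERATORS_PER_NODE`, and `cluster_accelerators` are illustrative only.

```python
# Maximum worker nodes per cluster, as stated in the announcement.
MAX_NODES = 100_000

# Accelerators per node. ASSUMPTION: inferred from the quoted totals
# (1.6 million / 100,000 and 800,000 / 100,000), not given in the text.
ACCELERATORS_PER_NODE = {
    "Trn2 (Trainium chips)": 16,
    "P5/P6 (NVIDIA GPUs)": 8,
}


def cluster_accelerators(nodes: int, per_node: int) -> int:
    """Total accelerators in a cluster of identical nodes."""
    return nodes * per_node


totals = {
    name: cluster_accelerators(MAX_NODES, per_node)
    for name, per_node in ACCELERATORS_PER_NODE.items()
}

for name, total in totals.items():
    print(f"{name}: {total:,}")
```

A cluster that mixes instance types would fall somewhere between the two figures, depending on the node split.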
The most advanced AI models, with trillions of parameters, demonstrate significantly enhanced capabilities in understanding context, reasoning, and solving complex tasks. To build and operate these increasingly powerful models, organizations require access to massive numbers of compute accelerators in a single cluster. Consolidated access to such a large pool of compute accelerators delivers three benefits: it allows organizations to build and deploy more powerful AI models than before, reduces costs by efficiently sharing compute accelerators between training and inference workloads, and enables seamless use of existing AI/ML tools and frameworks that are not designed to work across clusters.
To learn more, see the launch blog.
Source: Amazon Web Services