Amazon SageMaker AI now supports inference recommendations, a new capability that eliminates manual optimization and benchmarking to deliver optimal inference performance. By delivering validated, optimal deployment configurations with performance metrics, SageMaker AI accelerates the path to production and keeps your model developers focused on building accurate models, not managing infrastructure.
Customers bring their own generative AI models, define expected traffic patterns, and specify a performance goal (optimize for cost, minimize latency, or maximize throughput). SageMaker AI then analyzes the model’s architecture and applies optimizations aligned to that goal across multiple instance types, benchmarking each configuration on real GPU infrastructure using NVIDIA AIPerf. By evaluating multiple instance types, customers can select the most price-performant option for their workload. The result is deployment-ready configurations with validated metrics including time to first token, inter-token latency, request latency percentiles, throughput, and cost projections.
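To illustrate how a performance goal narrows a set of benchmarked configurations to one choice, here is a minimal Python sketch. It is not the SageMaker AI API: the `BenchmarkResult` type, the `pick_configuration` function, the instance names, and all numbers are hypothetical placeholders invented for this example. It also assumes cost is compared as dollars per million generated tokens, which the announcement does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """Hypothetical record of one benchmarked configuration (not a SageMaker type)."""
    instance_type: str
    p99_latency_ms: float           # request latency, 99th percentile
    throughput_tokens_per_s: float  # sustained output tokens per second
    hourly_cost_usd: float          # price of the instance per hour

    @property
    def cost_per_million_tokens(self) -> float:
        # Assumed cost metric: dollars spent to generate one million tokens.
        tokens_per_hour = self.throughput_tokens_per_s * 3600
        return self.hourly_cost_usd / tokens_per_hour * 1_000_000


def pick_configuration(results: list[BenchmarkResult], goal: str) -> BenchmarkResult:
    """Return the configuration that best matches the stated goal."""
    if not results:
        raise ValueError("no benchmark results to choose from")
    if goal == "cost":
        return min(results, key=lambda r: r.cost_per_million_tokens)
    if goal == "latency":
        return min(results, key=lambda r: r.p99_latency_ms)
    if goal == "throughput":
        return max(results, key=lambda r: r.throughput_tokens_per_s)
    raise ValueError(f"unknown goal: {goal!r}")


# Placeholder values for illustration only; these are not measured results.
results = [
    BenchmarkResult("instance-a", p99_latency_ms=900, throughput_tokens_per_s=1000, hourly_cost_usd=2.0),
    BenchmarkResult("instance-b", p99_latency_ms=600, throughput_tokens_per_s=2500, hourly_cost_usd=6.0),
    BenchmarkResult("instance-c", p99_latency_ms=750, throughput_tokens_per_s=4000, hourly_cost_usd=12.0),
]

for goal in ("cost", "latency", "throughput"):
    print(goal, "->", pick_configuration(results, goal).instance_type)
```

With these placeholder numbers each goal selects a different instance, which is why comparing several instance types against an explicit goal matters: the cheapest option per token is not necessarily the fastest or the highest-throughput one.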
The capability is available today in seven AWS Regions: US East (N. Virginia), US West (Oregon), US East (Ohio), Asia Pacific (Tokyo), Europe (Ireland), Asia Pacific (Singapore), and Europe (Frankfurt). To learn more, visit the SageMaker AI documentation.
Source: Amazon Web Services