Amazon SageMaker AI now supports EAGLE (Extrapolation Algorithm for Greater Language-model Efficiency) speculative decoding to improve large language model inference throughput by up to 2.5x. This capability enables models to predict and validate multiple tokens simultaneously rather than one at a time, improving response times for AI applications.
As customers deploy AI applications to production, they need capabilities to serve models with low latency and high throughput to deliver responsive user experiences. Data scientists and ML engineers lack efficient methods to accelerate token generation without sacrificing output quality or requiring complex model re-architecture, making it hard to meet performance expectations under real-world traffic. Teams spend significant time optimizing infrastructure rather than improving their AI applications. With EAGLE speculative decoding, SageMaker AI enables customers to accelerate inference throughput by allowing models to generate and verify multiple tokens in parallel rather than one at a time, maintaining the same output quality while dramatically increasing throughput. SageMaker AI automatically selects between EAGLE 2 and EAGLE 3 based on your model architecture, and provides built-in optimization jobs that use either curated datasets or your own application data to train specialized prediction heads. You can then deploy optimized models through your existing SageMaker AI inference workflow without infrastructure changes, enabling you to deliver faster AI applications with predictable performance.
You can use EAGLE speculative decoding in the following AWS Regions: US East (N. Virginia), US West (Oregon), US East (Ohio), Asia Pacific (Tokyo), Europe (Ireland), Asia Pacific (Singapore), and Europe (Frankfurt)
To learn more about EAGLE speculative decoding, visit AWS News Blog here, and SageMaker AI documentation here.
Categories: general:products/amazon-sagemaker,marketing:marchitecture/artificial-intelligence
Source: Amazon Web Services
Latest Posts
- (Updated) Microsoft 365: Modern Access Request and Access Denied web page [MC1188599]
![(Updated) Microsoft 365: Modern Access Request and Access Denied web page [MC1188599] 2 pexels leish 5258251](data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==)
- (Updated) Introducing Surveys Agent and Copilot Chat in Microsoft Forms [MC1229954]
![(Updated) Introducing Surveys Agent and Copilot Chat in Microsoft Forms [MC1229954] 3 pexels googledeepmind 18068537](data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==)
- (Updated) Upcoming change: disabling Teams meeting recording expiration notification emails [MC1245635]
![(Updated) Upcoming change: disabling Teams meeting recording expiration notification emails [MC1245635] 4 pexels alfonso escalante 1319242 2533092](data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==)
- Endpoint Data Loss Prevention: Always-on diagnostics for Windows endpoints (Phase 2) [MC1246003]
![Endpoint Data Loss Prevention: Always-on diagnostics for Windows endpoints (Phase 2) [MC1246003] 5 pexels icesky08 1294229](data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==)

![(Updated) Microsoft 365: Modern Access Request and Access Denied web page [MC1188599] 2 pexels leish 5258251](https://mwpro.co.uk/wp-content/uploads/2025/06/pexels-leish-5258251-150x150.webp)
![(Updated) Introducing Surveys Agent and Copilot Chat in Microsoft Forms [MC1229954] 3 pexels googledeepmind 18068537](https://mwpro.co.uk/wp-content/uploads/2025/06/pexels-googledeepmind-18068537-150x150.webp)
![(Updated) Upcoming change: disabling Teams meeting recording expiration notification emails [MC1245635] 4 pexels alfonso escalante 1319242 2533092](https://mwpro.co.uk/wp-content/uploads/2025/06/pexels-alfonso-escalante-1319242-2533092-150x150.webp)
![Endpoint Data Loss Prevention: Always-on diagnostics for Windows endpoints (Phase 2) [MC1246003] 5 pexels icesky08 1294229](https://mwpro.co.uk/wp-content/uploads/2025/06/pexels-icesky08-1294229-150x150.webp)
