SageMaker JumpStart now offers optimized deployments, enabling customers to deploy foundation models with pre-configured settings tailored to specific use cases and performance constraints. Optimized deployments simplify model deployment by offering task-aware configurations that optimize for cost, throughput, or latency based on your workload requirements, whether content generation, summarization, or Q&A. This launch includes support for 30+ popular models from Meta, Microsoft, Mistral AI, Qwen, Google, and TII, with visibility into key performance metrics such as P50 latency, time to first token (TTFT), and throughput before deployment.
With SageMaker JumpStart optimized deployments, customers can select from use case-specific configurations (such as generative writing or chat-style interactions) and choose an optimization target: cost-optimized, throughput-optimized, latency-optimized, or balanced performance. Models deploy to SageMaker AI Managed Inference endpoints or SageMaker HyperPod clusters with pre-set configurations that eliminate guesswork while maintaining full visibility into deployment details. Available models include Meta Llama 3.1 and 3.2 variants, Microsoft Phi-3, Mistral AI models including the new Mistral-Small-24B-Instruct-2501, Qwen 2 and 3 series including multimodal Qwen2-VL, Google Gemma, and TII Falcon3. All deployments leverage SageMaker’s VPC deployment capabilities, ensuring data control and production-ready infrastructure with enterprise-grade security. The feature is available in all AWS Regions where SageMaker JumpStart is currently supported.
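To make the idea of an optimization target concrete, here is a minimal sketch of how such a choice could be modeled. It is not the SageMaker API or the service's actual selection logic. The `DeploymentConfig` class, the `pick_config` function, the configuration names, and every metric value are hypothetical and invented for illustration. The "balanced" rule (lowest sum of ranks across cost, throughput, and P50 latency) is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentConfig:
    """Hypothetical pre-set configuration with its advertised metrics."""
    name: str
    p50_latency_ms: float
    ttft_ms: float  # recorded for display; not used for selection below
    throughput_tokens_per_s: float
    cost_per_hour_usd: float


# Made-up numbers, chosen only so that each target picks a different config.
CONFIGS = [
    DeploymentConfig("config-a", 900.0, 150.0, 450.0, 1.5),
    DeploymentConfig("config-b", 300.0, 80.0, 1200.0, 4.0),
    DeploymentConfig("config-c", 350.0, 90.0, 3000.0, 12.0),
    DeploymentConfig("config-d", 250.0, 50.0, 600.0, 9.0),
]


def _ranks(configs, key, reverse=False):
    """Map each config name to its position when sorted by `key` (0 = best)."""
    ordered = sorted(configs, key=key, reverse=reverse)
    return {c.name: i for i, c in enumerate(ordered)}


def pick_config(configs, target):
    """Return the config that best matches the given optimization target."""
    if target == "cost":
        return min(configs, key=lambda c: c.cost_per_hour_usd)
    if target == "throughput":
        return max(configs, key=lambda c: c.throughput_tokens_per_s)
    if target == "latency":
        return min(configs, key=lambda c: c.p50_latency_ms)
    if target == "balanced":
        # Assumed rule: the smallest sum of ranks across the three metrics.
        cost = _ranks(configs, lambda c: c.cost_per_hour_usd)
        thr = _ranks(configs, lambda c: c.throughput_tokens_per_s, reverse=True)
        lat = _ranks(configs, lambda c: c.p50_latency_ms)
        return min(configs, key=lambda c: cost[c.name] + thr[c.name] + lat[c.name])
    raise ValueError(f"unknown optimization target: {target!r}")


if __name__ == "__main__":
    for target in ("cost", "throughput", "latency", "balanced"):
        print(target, "->", pick_config(CONFIGS, target).name)
```

The sketch only illustrates the trade-off: the same set of configurations yields a different choice depending on whether cost, throughput, latency, or an overall balance matters most for the workload.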
To get started with optimized deployments, navigate to Models in SageMaker Studio, select your desired foundation model in the JumpStart Models tab, choose “Deploy,” and select your use case and performance optimization target. For details, visit the SageMaker JumpStart documentation. AWS is actively expanding support to include additional models.
Source: Amazon Web Services