Starting today, AWS Batch supports scheduling for SageMaker Training jobs. With AWS Batch for SageMaker Training jobs, data scientists can submit training jobs to configurable queues powered by AWS Batch. Jobs are scheduled based on priority and resource availability, which removes the need for manual retries and coordination. System administrators can also set up fair-share scheduling policies to optimize resource utilization across teams. Failed jobs are retried automatically, and queue status is visible throughout.
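To make the idea of scheduling by priority and resource availability concrete, here is a minimal toy model in plain Python. It is only an illustrative sketch and not AWS Batch's actual algorithm or API: the `ToyQueue` class, its method names, and the notion of counting capacity in "instances" are all hypothetical and chosen for this example.

```python
import heapq
from itertools import count


class ToyQueue:
    """Hypothetical in-memory queue; NOT the AWS Batch API or its scheduler."""

    def __init__(self, capacity):
        self.free = capacity      # instances currently available (assumed unit)
        self._heap = []           # entries: (-priority, submit order, name, instances)
        self._seq = count()       # tie-breaker: earlier submissions go first

    def submit(self, name, priority, instances):
        """Queue a job; a higher priority value is dispatched earlier."""
        heapq.heappush(self._heap, (-priority, next(self._seq), name, instances))

    def dispatch(self):
        """Start jobs in priority order for as long as the head job fits."""
        started = []
        while self._heap and self._heap[0][3] <= self.free:
            _, _, name, instances = heapq.heappop(self._heap)
            self.free -= instances
            started.append(name)
        return started

    def release(self, instances):
        """Return capacity to the pool when a job finishes."""
        self.free += instances


if __name__ == "__main__":
    queue = ToyQueue(capacity=4)
    queue.submit("tune-a", priority=1, instances=2)
    queue.submit("train-big", priority=5, instances=4)
    queue.submit("tune-b", priority=1, instances=2)
    print(queue.dispatch())   # highest-priority job that fits starts first
    queue.release(4)          # the large job finishes and frees its capacity
    print(queue.dispatch())   # remaining jobs start in submission order
```

In this sketch a job that does not fit simply waits at the head of the queue until capacity is released, which is the kind of waiting and resubmission that data scientists would otherwise handle by hand.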
You can also procure SageMaker Flexible Training Plans (FTP) to guarantee the capacity you need during the time you need it. With a Flexible Training Plan in place, Batch’s queuing capabilities allow you to maximize utilization for the duration of your plan. Data scientists can submit experiments with confidence directly from the SageMaker Python SDK, knowing that infrastructure complexities are handled automatically.
You can start using AWS Batch for SageMaker Training jobs immediately through the AWS Management Console, AWS Command Line Interface (CLI), or AWS SDKs. There are no additional charges for AWS Batch itself – you only pay for the AWS resources used to run your applications. AWS Batch for SageMaker Training jobs is now generally available in all commercial AWS Regions where AWS Batch and SageMaker AI are available. To get started, see the AWS Batch for SageMaker Training jobs documentation and our blog post.
Source: Amazon Web Services