Amazon CloudWatch Container Insights now supports collection of GPU metrics at sub-minute frequencies for AI and ML workloads running on Amazon EKS. Customers can configure the metric sample rate in seconds, enabling more granular monitoring of GPU resource utilization.
This enhancement enables customers to effectively monitor GPU-intensive workloads that run for less than 60 seconds, such as ML inference jobs that consume GPU resources for short durations. By increasing the sampling frequency, customers can maintain detailed visibility into short-lived GPU workloads. Sub-minute GPU metrics are sent to CloudWatch once per minute. This granular monitoring helps customers optimize their GPU resource utilization, troubleshoot performance issues, and ensure efficient operation of their containerized GPU applications.
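As a rough illustration of why sampling frequency matters for short-lived workloads, the sketch below counts how many samples in a one-minute window would land inside a brief GPU job. It is a simplified model, not the CloudWatch agent's actual sampling logic: it assumes instantaneous samples at fixed offsets starting at second 0, and the function name and job timings are hypothetical.

```python
def samples_observing_job(interval_s, job_start_s, job_end_s, window_s=60):
    """Return the sample times (in seconds) that fall while the job is running.

    Hypothetical model: one instantaneous sample every `interval_s` seconds,
    starting at t=0, over a window of `window_s` seconds.
    """
    return [
        t
        for t in range(0, window_s, interval_s)
        if job_start_s <= t < job_end_s
    ]


# A hypothetical 18-second inference job running from t=22s to t=40s.
job_start, job_end = 22, 40

# Sampling once per minute: the only sample (t=0) misses the job entirely.
print(samples_observing_job(60, job_start, job_end))  # []

# Sampling every 5 seconds: three samples land inside the job.
print(samples_observing_job(5, job_start, job_end))  # [25, 30, 35]

# Sampling every second: every second of the job is observed.
print(len(samples_observing_job(1, job_start, job_end)))  # 18
```

Under these assumptions, a once-per-minute sample can miss an 18-second job entirely, while a 5-second interval captures it several times.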
Sub-minute GPU metrics in Container Insights are available in all AWS Commercial Regions and the AWS GovCloud (US) Regions.
To learn more about sub-minute GPU metrics in Container Insights, visit the NVIDIA GPU metrics page in the Amazon CloudWatch User Guide. Sub-minute GPU metrics in Container Insights are available at no additional cost. For Container Insights pricing, see the Amazon CloudWatch Pricing page.
Source: Amazon Web Services