Amazon CloudWatch Container Insights now supports Neuron UltraServers on Amazon EKS

Amazon CloudWatch Container Insights now supports Neuron UltraServers on Amazon EKS, providing enhanced observability for customers running large-scale, high-performance machine learning workloads on multi-instance nodes. This new capability enables data scientists and ML engineers to efficiently monitor and troubleshoot their containerized ML applications, offering aggregated metrics and simplified management across Neuron UltraServer groups.

Neuron UltraServers combine multiple EC2 instances into a single logical server unit, optimized for machine learning workloads using AWS Trainium and Inferentia accelerators. Container Insights, a monitoring and diagnostics feature in Amazon CloudWatch, automatically collects metrics from containerized applications. With this launch, Container Insights introduces a new filter specifically for UltraServers in EKS environments. You can select an UltraServer ID to view aggregate metrics across all instances within that UltraServer, so you no longer need to monitor each instance separately. Per-instance metrics remain available alongside the consolidated performance data for the entire UltraServer group, streamlining the monitoring of ML workloads running on AWS Neuron.
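To make the idea of UltraServer-level aggregation concrete, here is a minimal Python sketch of rolling per-instance readings up into one value per UltraServer ID. It does not call CloudWatch, and it is not how Container Insights computes its metrics. The sample IDs, the tuple layout, the use of a simple average, and the function name `aggregate_by_ultraserver` are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-instance samples: (ultraserver_id, instance_id, utilization %).
# The IDs, the field layout, and the values are illustrative only; they are
# not real Container Insights output.
samples = [
    ("us-001", "i-aaa", 80.0),
    ("us-001", "i-bbb", 60.0),
    ("us-002", "i-ccc", 90.0),
]


def aggregate_by_ultraserver(samples):
    """Group per-instance utilization by UltraServer ID and average each group."""
    groups = defaultdict(list)
    for ultraserver_id, _instance_id, utilization in samples:
        groups[ultraserver_id].append(utilization)
    # Averaging is an assumed aggregation; a sum, max, or percentile may suit
    # other metrics better.
    return {uid: mean(values) for uid, values in groups.items()}


print(aggregate_by_ultraserver(samples))
```

Filtering by a single UltraServer ID then amounts to looking up one key in the returned dictionary, while the original per-instance samples stay available for drill-down.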

Amazon CloudWatch Container Insights is available in all commercial AWS Regions and the AWS GovCloud (US) Regions.

To get started, see AWS Neuron metrics for AWS Trainium and AWS Inferentia in the Amazon CloudWatch User Guide.

Categories: marketing:marchitecture/containers,general:products/amazon-eks,general:products/aws-govcloud-us,marketing:marchitecture/management-and-governance,general:products/amazon-cloudwatch

Source: Amazon Web Services


