Today, AWS announces the general availability of Neuron 2.24, delivering new features and performance improvements for customers building and deploying deep learning models on AWS Inferentia- and Trainium-based instances. Neuron 2.24 introduces support for PyTorch 2.7, enhanced inference capabilities, and expanded compatibility with popular machine learning frameworks. These updates help developers and data scientists accelerate model training and inference, improve efficiency, and simplify the deployment of large language models and other AI workloads.
With Neuron 2.24, customers can take advantage of advanced inference features such as prefix caching for faster Time-To-First-Token (TTFT), disaggregated inference to reduce prefill-decode interference, and context parallelism for improved performance on long sequences. The release also adds support for Qwen 2.5 text models and improved integration with Hugging Face Optimum Neuron and the PyTorch-based NxD Core backend.
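The TTFT benefit of prefix caching comes from computing the expensive prefill pass over a shared prompt prefix (for example, a common system prompt) only once and reusing its cached attention state for subsequent requests. The sketch below is a minimal conceptual illustration in plain Python, not the Neuron or vLLM API; all names and the simulated KV state are illustrative assumptions.

```python
# Conceptual sketch of prefix caching (illustrative only, not the Neuron API):
# the costly "prefill" over a shared prompt prefix runs once, is cached, and
# later requests reuse it, so only the new suffix needs fresh computation.
from functools import lru_cache

PREFILL_CALLS = 0  # counts how many times the simulated prefill actually runs


@lru_cache(maxsize=128)
def prefill(prefix: str) -> tuple:
    """Simulate the expensive prefill pass that builds KV state for a prefix."""
    global PREFILL_CALLS
    PREFILL_CALLS += 1
    # Stand-in for per-token attention key/value entries.
    return tuple(hash(tok) for tok in prefix.split())


def generate(prefix: str, suffix: str) -> int:
    """Handle a request; the prefix's KV state is served from cache if present."""
    kv_state = prefill(prefix)  # cache hit => near-zero added TTFT cost
    return len(kv_state) + len(suffix.split())


system_prompt = "You are a helpful assistant ."
generate(system_prompt, "What is Neuron ?")       # first call: prefill runs
generate(system_prompt, "Summarize this text .")  # cached prefill is reused
```

In a real serving stack the cached state is the transformer KV cache rather than a Python tuple, but the access pattern is the same: requests sharing a prefix skip redundant prefill work, which is where the TTFT improvement comes from.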
Neuron 2.24 is available in all AWS Regions where Inferentia and Trainium instances are offered.
To learn more and for a full list of new features and enhancements, see:
Categories: general:products/amazon-machine-learning,marketing:marchitecture/compute,general:products/aws-tools-and-sdks
Source: Amazon Web Services