
ROCm 7.2.0 has been released with enhancements aimed at improving support for new AMD hardware and operating systems. The update adds support for newer graphics processing units (GPUs), including RDNA4 models, along with expanded virtualization support and performance optimizations across different GPU types and applications. It also brings lower-level changes such as faster memset operations and improved handling of asynchronous GPU work, which together aim to increase computational throughput on AMD GPUs. In addition, the release adds new APIs, refreshes tools for life sciences workloads, and updates components used by deep learning frameworks such as JAX.



ROCm 7.2.0 released

The latest version of ROCm, 7.2.0, has been released with a range of enhancements aimed at improving support for new AMD hardware, operating systems, and system components.

A key change is support for newer graphics processing units (GPUs). This release officially supports RDNA4 GPUs such as the Radeon AI PRO R9600D and RX 9060 XT LP, alongside continued support for RDNA3-based models such as the RX 7700. Virtualization support has also been extended to SLES 15 SP7 on AMD Instinct MI355X and MI350X GPUs.

Managing complex GPU systems, especially those used in data centers, requires attention to detail across hardware and software layers. ROCm 7.2.0 addresses part of this with a feature called Node Power Management (NPM), which distributes power among multiple AMD GPUs within the same server node. It is particularly useful for setups using MI355X or MI350X models in supported environments.
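To make the idea of sharing one power budget across a node more concrete, here is a minimal sketch of a proportional allocation policy. It is only an illustration: the function `distribute_power`, the per-GPU floor, and all wattage figures are hypothetical, and nothing here is taken from how NPM actually decides power limits.

```python
def distribute_power(node_budget_w, demands_w, floor_w):
    """Split a node power budget across GPUs (hypothetical policy, not NPM's).

    Every GPU first receives `floor_w`. Whatever budget is left is shared in
    proportion to how much each GPU asks for above that floor, and no GPU
    receives more than it asked for.
    """
    n = len(demands_w)
    if node_budget_w < floor_w * n:
        raise ValueError("budget cannot cover the per-GPU floor")

    spare = node_budget_w - floor_w * n
    extra_demand = [max(d - floor_w, 0.0) for d in demands_w]
    total_extra = sum(extra_demand)
    if total_extra == 0:
        return [floor_w] * n

    # scale < 1 means the node is oversubscribed and every GPU is trimmed
    # by the same fraction of its extra demand.
    scale = min(1.0, spare / total_extra)
    return [floor_w + e * scale for e in extra_demand]


# Made-up numbers: four GPUs asking for different amounts of power
# under a 4000 W node budget with a 500 W floor per GPU.
caps = distribute_power(4000.0, [1400.0, 1000.0, 600.0, 1400.0], 500.0)
print([round(c) for c in caps])  # [1250, 917, 583, 1250]
```

In this toy policy the busiest GPUs give up the most watts in absolute terms, while a lightly loaded GPU stays close to its request. A real power manager would also have to react to telemetry over time, which this sketch ignores.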

Performance improvements are also part of this release, with reported gains across different GPU types and applications. For example, Llama 3.1 models running on MI355X GPUs are reported to see higher throughput and lower latency. Similar optimizations for GLM-4.6 and DeepEP workloads target AMD Instinct MI300X hardware.

On the development side, the HIP runtime itself has been improved. The team has optimized its doorbell mechanism, the signal the host uses to notify the GPU that new work has been queued, which matters most when executing complex computational graphs. This brings performance benefits comparable to what NVIDIA offers with its CUDA Graph optimizations and makes graph-based workloads more efficient to run.
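The reason doorbell handling matters for graphs can be shown with a small toy model. The sketch below assumes that a doorbell ring is a per-submission cost and that a graph lets the runtime queue many kernels before ringing once. `ToyQueue` and both launch functions are hypothetical stand-ins, not HIP types or calls, and the actual optimization in ROCm 7.2.0 may work differently.

```python
from dataclasses import dataclass, field


@dataclass
class ToyQueue:
    """Hypothetical stand-in for a GPU command queue (not a HIP type)."""
    packets: list = field(default_factory=list)
    doorbell_rings: int = 0

    def enqueue(self, packet):
        self.packets.append(packet)

    def ring_doorbell(self):
        # In this model, one ring tells the device that everything
        # queued so far is ready to be processed.
        self.doorbell_rings += 1


def launch_individually(queue, kernels):
    """Submit each kernel on its own: one doorbell ring per kernel."""
    for kernel in kernels:
        queue.enqueue(kernel)
        queue.ring_doorbell()


def launch_as_graph(queue, kernels):
    """Queue the whole graph first, then ring the doorbell once."""
    for kernel in kernels:
        queue.enqueue(kernel)
    queue.ring_doorbell()


kernels = [f"kernel_{i}" for i in range(8)]

eager = ToyQueue()
launch_individually(eager, kernels)

graph = ToyQueue()
launch_as_graph(graph, kernels)

print(eager.doorbell_rings, graph.doorbell_rings)  # 8 1
```

Both queues end up holding the same eight packets. Only the number of host-to-device notifications differs, which is the kind of overhead a doorbell optimization would target.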

Other runtime changes include faster memset operations, which speed up filling GPU memory with a value, and fewer bottlenecks in the handling of asynchronous GPU work. Together, these changes aim to extract more computational throughput from AMD's GPUs.

Beyond raw performance, ROCm continues to add functionality through its APIs. This release brings new HIP APIs related to libraries, direct GPU memory management, and control of execution streams. The backend conduit for rocSHMEM has also been improved, giving GPUs better ways to communicate with each other without CPU involvement, both within a single node and across multiple nodes.

The update also reaches beyond the core runtime and communication layers. Tools tailored for life sciences workloads have been refreshed, and deep learning frameworks like JAX benefit from updates to components such as MIOpen (AMD's library of deep learning primitives) and MIGraphX (AMD's graph inference engine and compiler). There has also been a recent push on ONNX, further broadening the reach of ROCm libraries.

For users tracking down problems or looking for clearer guidance, the release notes list several fixes. These include fixes for issues with the ROCm Runfile Installer, and more examples are now available in the ROCm examples repository. As usual, the documentation has been updated alongside the code.

Release rocm-7.2.0 · ROCm/ROCm