AMD Working To Release MES Documentation & Source Code
"[..], AMD now says they will be releasing documentation followed by the source code for their Micro-Engine Scheduler (MES) IP block found within Radeon GPUs."
New blog post on the 100+ Guix packages contributed by AMD, our preliminary tests on one of the French national supercomputers, and how this work can benefit both AMD and the French and European #HPC environments going forward.
Turns out that the mobile processor in this server is so new that #AMD hasn't released #ROCm libraries for the integrated GPU, a #Radeon 780M (GFX1103), yet. "New" is relative; AMD is, seemingly, terrible at releasing code to keep pace with hardware releases. To contextualize how terrible, ROCm just got support for the 7900 XT in the last 3 months; this GPU was released in December of 2022. I wish external GPU enclosures weren't a million dollars.
Ouch, this is not reproducible without #ROCm, which means it might just be ROCm not cleaning up properly after itself, or something else. Interestingly, switching the order in which the platforms are initialized deadlocks ROCm instead of segfaulting #rusticl.
So I bought a fancy #AMD graphics card because I didn't want to support the #Nvidia #CUDA hegemony. I also had high hopes for their supposedly more open drivers.
I am not sure if this was a great idea, because while it's been super good for my kids and their games, it's been a steep uphill climb (both ways) to get #ROCm and #HIP to do anything.
And the core bits are distributed as these precompiled packages that only work on a handful of specific versions of Linux distributions.
@gabmus I use #rocm 5.7 to run #opencl, Google's #jax (for PyMC), and #pytorch on two Vega cards (Vega 64 and Radeon Pro WX 9100) on Arch and Ubuntu. They all run OK, but correct setup needs some googling around, and jax needs some #xla flags exported. Situation is much, much better than 2 years ago, though. @oblomov
Anyway, as I mentioned recently, I have a new workstation that finally allows me to test our code using all three backends (#CUDA, #ROCm/#HIP and #CPU w/ #OpenMP), thanks to having an #AMD #Ryzen processor with an integrated #GPU in addition to a discrete #NVIDIA GPU.
Of course the iGPU is massively underpowered compared to the high-end dGPU workhorse, but I would expect it to outperform the CPU on most workloads.
And this is where things get interesting.
So, one of the reasons why we could implement the #HIP backend easily in #GPUSPH is that #AMD provides #ROCm drop-in replacements for much of the #NVIDIA #CUDA libraries, including #rocThrust, which (as I mentioned in the other thread) is a fork of #Thrust with a #HIP/#ROCm backend.
This is good as it reduces porting effort, but it also means you have to trust the quality of the provided implementation.
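To make the "drop-in" aspect concrete, here's a minimal sketch (not taken from #GPUSPH, just a hypothetical standalone example of standard Thrust usage): the same source builds unchanged with nvcc against NVIDIA's Thrust or with hipcc against rocThrust, because rocThrust keeps the thrust:: namespace and the thrust/ include paths.

```cpp
// Minimal sketch (hypothetical example): the same Thrust code can be
// compiled with nvcc against NVIDIA's Thrust or with hipcc against
// rocThrust, since rocThrust keeps the thrust:: namespace and headers.
#include <thrust/device_vector.h>
#include <thrust/sequence.h>
#include <thrust/sort.h>
#include <thrust/functional.h>
#include <cstdio>

int main() {
    thrust::device_vector<float> v(1 << 20);    // data lives on the device
    thrust::sequence(v.begin(), v.end());       // fill with 0, 1, 2, ...
    thrust::sort(v.begin(), v.end(),
                 thrust::greater<float>());     // sort descending on the device
    std::printf("largest value: %g\n", static_cast<float>(v[0]));
    return 0;
}
```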
While I was away for #SPHERIC, they delivered a couple of new workstations to the office, so today I spent some time setting one up, installing the OS, checking it matched the specs and testing the hardware. One thing I hadn't realized when we ordered this one is that in addition to the NVIDIA GPUs it would also give me access to an integrated AMD GPU. This is, I think, my first system with hardware from both vendors (although of a very different class).
So obviously I took the opportunity to install both #CUDA and #HIP/#ROCm and make sure our software still built and ran correctly.
And honestly, it's unpleasant that in 2023 you still have to do some hoop jumping for either platform.
With CUDA, the issue is always making sure that you have a supported gcc version.
With ROCm, it's easy to trip on unsupported/untested hardware, and you have to find the right combination of env vars and defines to make it go.
Corporate #FLOSS at its worst: #NVIDIA controls the #Thrust library and its #CUDA, #OpenMP and #TBB backends. #AMD provides rocThrust, which is just Thrust with the CUDA part stripped out and a new backend for #ROCm / #HIP. Nobody* is working on a backend for #SYCL. #Intel provides its own #oneAPI alternative as #oneDPL, which is NOT a drop-in replacement.
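For context, the backend in upstream Thrust is selected at compile time through a macro, so the same code can target the CUDA, OpenMP or TBB backend just by changing a define. A hedged sketch using the standard Thrust macros (nothing GPUSPH-specific; the file name and build line are illustrative, and the Thrust headers are assumed to be on the include path):

```cpp
// Build e.g. with:
//   g++ -fopenmp -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP sum.cpp
// (or with THRUST_DEVICE_SYSTEM_TBB and -ltbb for the TBB backend);
// compiling with nvcc and no define gives the default CUDA backend.
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <cstdio>

int main() {
    thrust::device_vector<int> v(1000, 1);              // 1000 ones on the "device"
    int sum = thrust::reduce(v.begin(), v.end(), 0);    // runs on whichever backend was selected
    std::printf("sum = %d\n", sum);                      // expected: 1000
    return 0;
}
```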