asmodai, to Amd
@asmodai@mastodon.social avatar

AMD Working To Release MES Documentation & Source Code

"[...] AMD now says they will be releasing documentation followed by the source code for their Micro-Engine Scheduler (MES) IP block found within Radeon GPUs."

Note: towards the end of May

https://www.phoronix.com/news/AMD-MES-Docs-And-Source-Code

civodul, to guix
@civodul@toot.aquilenet.fr avatar

“HIP and ROCm come to Guix”
https://hpc.guix.info/blog/2024/01/hip-and-rocm-come-to-guix/

New blog post on the 100+ Guix packages contributed by AMD, our preliminary tests on one of the French national supercomputers, and how this work can benefit both AMD and the French and European HPC environments going forward.

tristan, to Amd

Turns out that the mobile processor in this server is so new that AMD hasn't released libraries for the integrated GPU, a 780M (GFX1103), yet. "New" is relative; AMD is, seemingly, terrible at releasing code to keep pace with hardware releases. To contextualize how terrible: ROCm only gained support for the 7900 XT in the last 3 months, and that GPU was released in December of 2022. I wish external GPU enclosures weren't a million dollars.

giuseppebilotta, to linux
@giuseppebilotta@fediscience.org avatar

Damn, upgrading to kernel 6.6.9 seems to have broken #ROCm on my setup:

amdgpu 0000:66:00.0: amdgpu: bo 000000008744cead va 0x0800000000-0x08000001ff conflict with 0x0800000000-0x0800000002

giuseppebilotta,
@giuseppebilotta@fediscience.org avatar

Ouch, this is not reproducible without #ROCm, which means it might just be ROCm not cleaning up properly after itself, or something else. Interestingly, switching the order in which the platforms are initialized deadlocks ROCm instead of segfaulting.

harish, to Amd
@harish@hachyderm.io avatar

So I bought a fancy #AMD graphics card because I didn't want to support the #Nvidia #CUDA hegemony. I also had high hopes for their supposedly more open drivers.

I am not sure if this was a great idea, because while it's been super good for my kids and their games, it's been a steep uphill climb (both ways) to get #ROCm and #HIP to do anything.

And the core bits are distributed as these precompiled packages that only work on a handful of specific versions of Linux distributions.

gabmus, to linux
@gabmus@fosstodon.org avatar

@oblomov I've been asked today: is #ROCm (on #linux) any good these days? I haven't really been into the #gpgpu space for a while.

Also #askfedi

mwibral,
@mwibral@mastodon.social avatar

@gabmus
I use #rocm 5.7 to run #opencl, Google's #jax (for pymc), and #pytorch on two Vega cards (Vega 64 and Radeon Pro WX9100) on Arch and Ubuntu. They all run OK, but correct setup needs some googling around, and jax needs exporting some #xla flags. The situation is much, much better than 2 years ago, though.
@oblomov
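The post doesn't say which flags were needed, but as a hypothetical sketch, these are the kind of real knobs (device selection, JAX memory preallocation, XLA tuning) that typically need exporting on a ROCm box; which ones apply depends on your card and ROCm version:

```shell
# Illustrative only -- variable values are assumptions, not the poster's config.
export HIP_VISIBLE_DEVICES=0                 # pick which AMD GPU ROCm/HIP sees
export XLA_PYTHON_CLIENT_PREALLOCATE=false   # stop JAX grabbing all VRAM up front
export XLA_FLAGS=--xla_gpu_autotune_level=0  # example XLA flag; tune per setup
python -c 'import jax; print(jax.devices())' # sanity check: GPU should be listed
```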

giuseppebilotta, (edited) to Amd
@giuseppebilotta@fediscience.org avatar

Anyway, as I mentioned recently, I have a new workstation that finally allows me to test our code using all three backends, thanks to having an AMD processor with an integrated GPU in addition to a discrete NVIDIA GPU.
Of course the iGPU is massively underpowered compared to the high-end dGPU workhorse, but I would expect it to outperform the CPU on most workloads.
And this is where things get interesting.

giuseppebilotta,
@giuseppebilotta@fediscience.org avatar

So, one of the reasons why we could implement the #HIP backend easily in #GPUSPH is that #AMD provides, in #ROCm, drop-in replacements for much of the #NVIDIA #CUDA libraries, including #rocThrust, which (as I mentioned in the other thread) is a fork of #Thrust with a #HIP/#ROCm backend.
This is good as it reduces porting effort, but it also means you have to trust the quality of the provided implementation.
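The "drop-in" part can be sketched like this: a hypothetical source file written against the `thrust/` headers (file name and flags are illustrative assumptions, not GPUSPH's actual build) compiles for either vendor without changing the includes:

```shell
# Illustrative sketch -- names and flags are assumptions, not GPUSPH's build.
# sort_demo.cu uses only thrust::device_vector and thrust::sort:
nvcc  sort_demo.cu -o sort_cuda   # NVIDIA: Thrust ships with the CUDA toolkit
hipcc sort_demo.cu -o sort_hip    # AMD: rocThrust provides the same thrust/ headers
```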

giuseppebilotta, to random
@giuseppebilotta@fediscience.org avatar

While I was away for #SPHERIC, they delivered a couple of new workstations to the office, so today I spent some time setting one up, installing the OS, checking it matched the specs and testing the hardware. One thing I hadn't realized when we ordered this one is that in addition to the NVIDIA GPUs it would also give access to the integrated AMD GPU. This is, I think, my first system where I have hardware from both vendors (although of very different class).

giuseppebilotta,
@giuseppebilotta@fediscience.org avatar

So obviously I took the opportunity to install both #CUDA and #HIP/#ROCm and make sure our software still built and ran correctly.
And honestly, it's unpleasant that in 2023 you still have to do some hoop-jumping for either platform.
With CUDA, the issue is always making sure that you have a supported gcc version.
With ROCm, it's easy to trip over unsupported/untested hardware, and you have to find the right combination of environment variables and defines to make it go.
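As a hedged sketch of the kind of environment-variable juggling this refers to: the usual workaround for "unsupported" RDNA hardware is the real `HSA_OVERRIDE_GFX_VERSION` override, though the exact value and whether it works is very much per-card (the binary name below is purely illustrative):

```shell
# Illustrative sketch -- values are examples, not tested on this exact setup.
export HSA_OVERRIDE_GFX_VERSION=11.0.0   # make a gfx1103 iGPU masquerade as gfx1100
export HIP_VISIBLE_DEVICES=0             # restrict which device HIP enumerates
./my_hip_app                             # hypothetical HIP binary
```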

gpuopen, to random
@gpuopen@mastodon.gamedev.place avatar

Need to run large MPI jobs across multiple GPUs?

Don't miss our latest blog post covering GPU-aware MPI with ROCm support 👇

https://gpuopen.com/learn/amd-lab-notes/amd-lab-notes-gpu-aware-mpi-readme/?utm_source=mastodon&utm_medium=social&utm_campaign=amd-lab-notes

giuseppebilotta, to random
@giuseppebilotta@fediscience.org avatar

Corporate behavior at its worst: #NVIDIA controls the #Thrust library and its #CUDA and #OpenMP/#TBB backends. #AMD provides rocThrust, which is just Thrust with the CUDA part stripped and a new backend for #HIP/#ROCm. Nobody* is working on a #SYCL backend for Thrust.
#Intel provides its own alternative as #oneDPL, which is NOT a drop-in replacement.

This is why we can't have nice things.

*there's a dead project here
https://github.com/wdmapp/syclthrust
