AMD Working To Release MES Documentation & Source Code
"[..], AMD now says they will be releasing documentation followed by the source code for their Micro-Engine Scheduler (MES) IP block found within Radeon GPUs."
New blog post on the 100+ Guix packages contributed by AMD, our preliminary tests on one of the French national supercomputers, and how this work can benefit both AMD and the French and European #HPC environments going forward.
Turns out that the mobile processor in this server is so new that #AMD hasn't released #ROCm libraries for the integrated GPU, a #Radeon 780M (GFX1103), yet. "New" is relative; AMD is, seemingly, terrible at releasing code to keep pace with hardware releases. To contextualize how terrible, ROCm just got support for the 7900 XT in the last 3 months; this GPU was released in December of 2022. I wish external GPU enclosures weren't a million dollars.
Ouch, this is not reproducible without #ROCm, which means it might just be ROCm not cleaning up properly after itself, or something else. Interestingly, switching the order in which the platforms are initialized deadlocks ROCm instead of segfaulting #rusticl.
So I bought a fancy #AMD graphics card because I didn't want to support the #Nvidia #CUDA hegemony. I also had high hopes for their supposedly more open drivers.
I am not sure if this was a great idea, because while it's been super good for my kids and their games, it's been a steep uphill climb (both ways) to get #ROCm and #HIP to do anything.
And the core bits are distributed as these precompiled packages that only work on a handful of specific versions of Linux distributions.
@gabmus I use #rocm 5.7 to run #opencl, Google's #jax (for PyMC), and #pytorch on two Vega cards (Vega 64 and Radeon Pro WX 9100) on Arch and Ubuntu. They all run OK, but correct setup needs some googling around, and jax needs some #xla flags exported. Situation is much, much better than 2 years ago, though. @oblomov
Anyway, as I mentioned recently, I have a new workstation that finally allows me to test our code using all three backends (#CUDA, #ROCm/#HIP and #CPU w/ #OpenMP), thanks to having an #AMD #Ryzen processor with an integrated #GPU in addition to a discrete #NVIDIA GPU.
Of course the iGPU is massively underpowered compared to the high-end dGPU workhorse, but I would expect it to outperform the CPU on most workloads.
And this is where things get interesting.
So, one of the reasons why we could implement the #HIP backend easily in #GPUSPH is that #AMD provides #ROCm drop-in replacements for much of the #NVIDIA #CUDA libraries, including #rocThrust, which (as I mentioned in the other thread) is a fork of #Thrust with a #HIP/#ROCm backend.
This is good as it reduces porting effort, but it also means you have to trust the quality of the provided implementation.
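To make the "drop-in" aspect concrete, here's a minimal sketch (not taken from #GPUSPH, just a hypothetical standalone example of standard Thrust usage): the same source builds unchanged with nvcc against NVIDIA's Thrust or with hipcc against rocThrust, because rocThrust keeps the thrust:: namespace and the thrust/ include paths.

```cpp
// Minimal sketch (hypothetical example): the same Thrust code can be
// compiled with nvcc against NVIDIA's Thrust or with hipcc against
// rocThrust, since rocThrust keeps the thrust:: namespace and headers.
#include <thrust/device_vector.h>
#include <thrust/sequence.h>
#include <thrust/sort.h>
#include <thrust/functional.h>
#include <cstdio>

int main() {
    thrust::device_vector<float> v(1 << 20);    // data lives on the device
    thrust::sequence(v.begin(), v.end());       // fill with 0, 1, 2, ...
    thrust::sort(v.begin(), v.end(),
                 thrust::greater<float>());     // sort descending on the device
    std::printf("largest value: %g\n", static_cast<float>(v[0]));
    return 0;
}
```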
While I was away for #SPHERIC, they delivered a couple of new workstations to the office, so today I spent some time setting one up, installing the OS, checking it matched the specs and testing the hardware. One thing I hadn't realized when we ordered this one is that in addition to the NVIDIA GPUs it would also give me access to an integrated AMD GPU. This is, I think, my first system with hardware from both vendors (although of a very different class).
So obviously I took the opportunity to install both #CUDA and #HIP/#ROCm and make sure our software still built and ran correctly.
And honestly, it's unpleasant that in 2023 you still have to do some hoop jumping for either platform.
With CUDA, the issue is always making sure that you have a supported gcc version.
With ROCm, it's easy to trip on unsupported/untested hardware, and you have to find the right combination of env vars and defines to make it go.
Corporate #FLOSS at its worst: #NVIDIA controls the #Thrust library and its #CUDA, #OpenMP and #TBB backends. #AMD provides rocThrust, which is just Thrust with the CUDA part stripped out and a new backend for #ROCm / #HIP. Nobody* is working on a backend for #SYCL. #Intel provides its own #oneAPI alternative as #oneDPL, which is NOT a drop-in replacement.
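For context, the backend in upstream Thrust is selected at compile time through a macro, so the same code can target the CUDA, OpenMP or TBB backend just by changing a define. A hedged sketch using the standard Thrust macros (nothing GPUSPH-specific; the file name and build line are illustrative, and the Thrust headers are assumed to be on the include path):

```cpp
// Build e.g. with:
//   g++ -fopenmp -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP sum.cpp
// (or with THRUST_DEVICE_SYSTEM_TBB and -ltbb for the TBB backend);
// compiling with nvcc and no define gives the default CUDA backend.
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <cstdio>

int main() {
    thrust::device_vector<int> v(1000, 1);              // 1000 ones on the "device"
    int sum = thrust::reduce(v.begin(), v.end(), 0);    // runs on whichever backend was selected
    std::printf("sum = %d\n", sum);                      // expected: 1000
    return 0;
}
```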