governa, to proxmox
@governa@fosstodon.org avatar

How to Passthrough NVIDIA GPU to Proxmox VE 8 Containers for CUDA / AI Acceleration and Media Transcoding

https://linuxhint.com/passthrough-nvidia-gpu-proxmox-ve-8-cuda-ai-media-transcoding/

KathyReid, to linux
@KathyReid@aus.social avatar

Sure, faster, better AI and humanoid robots are cool, but have you ever installed CUDA on Linux properly the first time around?

Yeah, me neither.

Focusing on technologies without making those technologies easier to obtain or easier to develop reinforces digital divides.

stib, to NixOS
@stib@aus.social avatar

Has anyone got CUDA to work on NixOS? I can get my cards recognised by nvidia-smi, but CUDA doesn't seem to be installed.
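
A quick way to tell "the driver is loaded" (which nvidia-smi already proves) apart from "the CUDA user-space libraries are actually visible" is to try loading them; a rough sketch in Python (library names vary, and on NixOS they typically have to be exposed through the environment rather than sitting on a global loader path):

```python
# Sketch: check whether the CUDA driver and runtime libraries are loadable.
# nvidia-smi only exercises the kernel driver; CUDA programs also need
# libcuda / libcudart to be findable by the dynamic loader.
import ctypes

for lib in ("libcuda.so.1", "libcudart.so"):
    try:
        ctypes.CDLL(lib)
        print(f"{lib}: found")
    except OSError as err:
        print(f"{lib}: not found ({err})")
```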

wagesj45, to weirdgirlmemes
@wagesj45@mastodon.jordanwages.com avatar

I hate it when you run into an issue in your program, you google it, and zero results show up. :pepe_g:

ramikrispin, to python
@ramikrispin@mstdn.social avatar

Going Further with CUDA for Python Programmers 🚀

The second part of Jeremy Howard's lecture on CUDA for Python programmers is now available 👇🏼

📽️: https://www.youtube.com/watch?v=eUuGdh3nBGo

This lecture focuses on the following topics:
✅ Optimized Matrix Multiplication
✅ Shared Memory Techniques for CUDA
✅ Implementing Shared Memory Optimization
✅ Translating Python to CUDA and Performance Considerations
✅ Numba: Bringing Python and CUDA Together

Notebook: https://github.com/cuda-mode/lectures/blob/main/lecture5/matmul_l5.ipynb
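
A minimal sketch of the shared-memory (tiled) matmul pattern the lecture covers, written here with Numba's CUDA target rather than the notebook's own code; TILE is an assumed block width:

```python
# Tiled matrix multiply: each block stages a TILE x TILE tile of A and B in
# shared memory, so each element is read from global memory only once per tile.
import numpy as np
from numba import cuda, float32

TILE = 16  # threads per block along each axis (assumption)

@cuda.jit
def matmul_shared(A, B, C):
    sA = cuda.shared.array((TILE, TILE), float32)
    sB = cuda.shared.array((TILE, TILE), float32)

    x, y = cuda.grid(2)          # row and column of C this thread computes
    tx = cuda.threadIdx.x
    ty = cuda.threadIdx.y

    acc = float32(0.0)
    for t in range((A.shape[1] + TILE - 1) // TILE):
        # Cooperatively load one tile of A and one tile of B (zero-pad edges).
        if x < A.shape[0] and t * TILE + ty < A.shape[1]:
            sA[tx, ty] = A[x, t * TILE + ty]
        else:
            sA[tx, ty] = 0.0
        if t * TILE + tx < B.shape[0] and y < B.shape[1]:
            sB[tx, ty] = B[t * TILE + tx, y]
        else:
            sB[tx, ty] = 0.0
        cuda.syncthreads()       # tile fully loaded before anyone reads it

        for k in range(TILE):
            acc += sA[tx, k] * sB[k, ty]
        cuda.syncthreads()       # done reading before the tile is overwritten

    if x < C.shape[0] and y < C.shape[1]:
        C[x, y] = acc

n = 256
A = np.random.rand(n, n).astype(np.float32)
B = np.random.rand(n, n).astype(np.float32)
C = np.zeros((n, n), dtype=np.float32)
grid = ((n + TILE - 1) // TILE, (n + TILE - 1) // TILE)
matmul_shared[grid, (TILE, TILE)](A, B, C)   # Numba handles the host/device copies
assert np.allclose(C, A @ B, atol=1e-3)
```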

Methylzero, to hpc

#HPC #CUDA #OpenCL #LAPACK
If you had to do a lot of linear least square solves, with potentially rank-deficient matrices, what would you use on a GPU? On CPUs, LAPACK's DGELSY does work, but most GPU libraries seem to not implement routines for rank-deficient matrices.
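
Not a library recommendation, but a PyTorch sketch of exactly that gap: the CPU path exposes LAPACK's column-pivoting GELSY driver, which copes with rank deficiency, while the CUDA path of torch.linalg.lstsq only offers the full-rank GELS driver, so a rank-aware solve on the GPU usually falls back to an SVD-based pseudoinverse:

```python
import torch

# Rank-deficient system: the second row is a multiple of the first.
A = torch.tensor([[1., 2., 3.],
                  [2., 4., 6.],
                  [1., 0., 1.]])
b = torch.tensor([[1.], [2.], [3.]])
print("rank:", torch.linalg.matrix_rank(A).item())

# CPU: column-pivoted QR (GELSY) handles the rank deficiency directly.
x_cpu = torch.linalg.lstsq(A, b, driver="gelsy").solution

# GPU: only the full-rank 'gels' driver is available on CUDA, so build the
# minimum-norm solution from an SVD-based pseudoinverse instead.
if torch.cuda.is_available():
    x_gpu = torch.linalg.pinv(A.cuda()) @ b.cuda()
```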

governa, to Amd
@governa@fosstodon.org avatar

AMD Quietly Funded A Drop-In CUDA Implementation Built On ROCm: It's Now Open-Source

https://www.phoronix.com/review/radeon-cuda-zluda

denzilferreira, to Amd
@denzilferreira@techhub.social avatar

ZLUDA, funded by AMD, is bringing CUDA to a Radeon near you. ML/AI rejoice!


https://www.phoronix.com/review/radeon-cuda-zluda

el0j, to random

Nvidia's moat under attack?

"AMD Quietly Funded A Drop-In CUDA Implementation Built On ROCm" -- https://www.phoronix.com/review/radeon-cuda-zluda

ramikrispin, to python
@ramikrispin@mstdn.social avatar

(1/2) Getting started with CUDA! 👇🏼

A new crash course for getting started with #CUDA with #Python by Jeremy Howard 🚀. CUDA is NVIDIA's programming model for parallel computing on GPUs. CUDA is used by tools such as #PyTorch, #TensorFlow, and other #deeplearning and LLM frameworks to speed up calculations. The course covers the following topics:
✅ Setting up CUDA
✅ CUDA foundation
✅ Working with kernels
✅ CUDA with PyTorch

Course 📽️: https://www.youtube.com/watch?v=nOxKexn3iBo

#datascience #machinelearning
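
For the "working with kernels" part, a self-contained first kernel one can try from Python via Numba (an assumption here; the course has its own setup):

```python
# Minimal CUDA kernel: element-wise vector addition over a 1D grid.
import numpy as np
from numba import cuda

@cuda.jit
def add_kernel(x, y, out):
    i = cuda.grid(1)          # global thread index
    if i < x.size:            # guard threads past the end of the array
        out[i] = x[i] + y[i]

n = 1_000_000
x = np.random.rand(n).astype(np.float32)
y = np.random.rand(n).astype(np.float32)
out = np.empty_like(x)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
add_kernel[blocks, threads_per_block](x, y, out)   # Numba copies arrays to/from the GPU
assert np.allclose(out, x + y)
```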

jannem, to GraphicsProgramming
@jannem@fosstodon.org avatar

@VileLasagna has a blog post on the relative speed of different #GPU compute frameworks on the same hardware and driver.

Tl;dr: on an #Nvidia card, with Nvidia drivers, #CUDA is the slowest, by far. Fastest is our old stalwart #OpenCL - almost twice as fast when used only for compute. #Vulkan is good, and the least affected by using the card for your desktop at the same time. Read it - it's good.

#HPC #gpgpu #compute

https://vilelasagna.ddns.net/coding/if-you-want-performance-maybe-you-should-drop-cuda/

slashtechno, to poetry
@slashtechno@fosstodon.org avatar

I've been facing many issues with using #Poetry (#pythonpoetry) with my #Python based #objectdetection project. I love Poetry for publishing packages, but think that #conda would be better since I have to deal with #CUDA and whatnot. Anyone familiar with a way to use pyproject.toml for publishing and building packages, even if Poetry isn't being used for dependency management?

For context, here's the project I'm working on: https://github.com/slashtechno/wyzely-detect

harish, to Amd
@harish@hachyderm.io avatar

So I bought a fancy AMD graphics card because I didn't want to support the hegemony. I also had high hopes for their supposedly more open drivers.

I am not sure if this was a great idea, because while it's been super good for my kids and their games, it's been a steep uphill climb (both ways) to get ROCm and anything built on it to do anything.

And the core bits are distributed as these precompiled packages that only work on a handful of specific versions of Linux distributions.

giuseppebilotta, to random
@giuseppebilotta@fediscience.org avatar

OK so I'm ready for today's lesson with the new laptop. My only gripe for the lesson will be that rusticl in Mesa 23.2 doesn't support the information I need. Apparently the feature was merged in a later commit
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24101
and I even tried upgrading to my distro's experimental 23.3-rc1 packages, but trying to use rusticl on those packages segfaults. So either I've messed up something with this mixed upgrade, or I've hit an actual bug.

giuseppebilotta,
@giuseppebilotta@fediscience.org avatar

I'm still moderately annoyed by the fact that there's no single #OpenCL platform to drive all compute devices on this machine. #PoCL comes close because it supports both the CPU and the #NVIDIA dGPU through #CUDA, but not the #AMD iGPU (there's an #HSA device, but). #Rusticl supports the iGPU (radeonsi) and the CPU (llvmpipe), but not the dGPU (partly because I'm running that on proprietary drivers for CUDA). Everything else has at best one supported device out of three available.
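
For anyone who wants to reproduce this kind of per-platform inventory, a small sketch with pyopencl (assuming it is installed) that prints which platform exposes which devices:

```python
# Enumerate OpenCL platforms and the devices each one exposes.
import pyopencl as cl

for platform in cl.get_platforms():
    print(platform.name, "-", platform.version)
    for dev in platform.get_devices():
        print("   ", cl.device_type.to_string(dev.type), "-", dev.name)
```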

Brett_E_Carlock, to random
@Brett_E_Carlock@mastodon.online avatar

Do I have anyone in my wider network with skills in programming CUDA, SYCL, and OpenCL?

We want to determine feasibility of migrating CUDA-only code to SYCL (via SYCLomatic?): OpenCV feature detection/extraction modules (SIFT, HAGOG, ORB, AKAZE).

The intent is to upstream all feasible work.

This, hopefully, should stand to benefit everyone instead of being limited to NVIDIA.

Currently in info gathering/people connecting phase, not yet funded & ready to go.

sri, to random
@sri@mastodon.social avatar

What an amazing talk by @airlied on the state of vendors, compute and community feedback. Please take the 45 minutes to watch - worth every minute! https://youtu.be/HzzLY5TdnZo

chrxh, to genart
@chrxh@mstdn.science avatar

After one more year of intensive work and numerous test runs, a new major update for https://github.com/chrxh/alien is finally polished and ready. It offers possibilities I had only dreamed of before. 🪐

YouTube: https://youtu.be/dSkxvi9igqQ

#ArtificialLife #generativeart #Cuda


schenklklopfer, to foss German
@schenklklopfer@chaos.social avatar

Does anyone have experience migrating from CUDA to something that's better than CUDA?

Have CUDA, looking for something better.

pekka, to random

chipStar 1.0 released! It's a tool for compiling and running CUDA/HIP applications on SPIR-V-supported OpenCL or LevelZero platforms. v1.0 can already run various HPC applications correctly. See: https://github.com/CHIP-SPV/chipStar/releases/tag/v1.0

blaise, to homelab
@blaise@hachyderm.io avatar

Homelab question:

Should I add an NVIDIA Tesla K40m 12GB GDDR5 passive CUDA GPU accelerator to my server?
(Cisco UCS C220 M3, 128 GB RAM)

Will it help with virtual terminal sessions?
Will it help with workloads that access the CUDA API?

giuseppebilotta, (edited ) to Amd
@giuseppebilotta@fediscience.org avatar

Anyway, as I mentioned recently, I have a new workstation that finally allows me to test our code using all three backends, thanks to having an AMD processor with an integrated GPU in addition to a discrete NVIDIA GPU.
Of course the iGPU is massively underpowered compared to the high-end dGPU workhorse, but I would expect it to outperform the CPU on most workloads.
And this is where things get interesting.

giuseppebilotta,
@giuseppebilotta@fediscience.org avatar

So, one of the reasons why we could implement the HIP/ROCm backend easily is that ROCm provides drop-in replacements for much of the CUDA libraries, including one that (as I mentioned in the other thread) is a fork of its CUDA counterpart with a HIP/ROCm backend.
This is good as it reduces porting effort, but it also means you have to trust the quality of the provided implementation.

giuseppebilotta, (edited )
@giuseppebilotta@fediscience.org avatar

Turns out, the ROCm ecosystem is less mature than the CUDA one it emulates (unsurprising, given how much more recent it is), and has obviously been tested much less on exotic hardware configurations and with the wide variety of software and developers that the CUDA libraries have interacted with.
In the few days in which I've had the opportunity to play with it, I've already discovered two issues with it:
