ProjectPhysX

@ProjectPhysX@mast.hpc.social

Summa cum laude Physics #PhD 🖖🧐🎓 | Graduate at EliteNet Bavaria 🧬 & DLR 🛰 | Developer of #FluidX3D #CFD 🌊 | Khronos #OpenCL Advisor 💻 | #GPU Wizard at #Intel 🟦

https://github.com/ProjectPhysX/FluidX3D

ProjectPhysX, 14 days ago to random

About an hour ago, the sky turned blood red over Bavaria, Germany. It's the first time I've ever seen an #Aurora in my life. It was bright enough to be easily visible with the naked eye. An absolutely jaw-dropping experience. 🖖😳✨
#polarlights #northernlights #solarstorm
cc @tagesschau @DLR @dlr_next @stim3on

at 22:29 they got a lot brighter, covering half of the sky; even some green visible at the bottom; another 10s exposure
vertical 10s exposure at 22:37
before fading again, they moved westward in front of the moon, here a 10s exposure at 22:46 showing distinct lines shaped by the magnetic field

+ NatureMC, elCelio, publicvoit, gracicot +46 more

ProjectPhysX, 14 days ago

@tagesschau @DLR @dlr_next @stim3on obligatory evidence photo that I really took these pictures^^ 🖖🧐

+ AnnaAnthro, oblomov

ProjectPhysX, 14 days ago

@tagesschau @DLR @dlr_next @stim3on 2 hours later, May 11 00:39-00:54, round 2. The #aurora covers the entire sky over Bavaria, Germany. It's even visible to the naked eye when looking south! This is insanity!! 🖖😳
The first image even looks like the camera sensor of my phone got hit by radiation particles.
#polarlights #northernlights #Bavaria #Germany

10s exposure at 00:45, streamers dancing rapidly
10s exposure at 00:48
10s exposure at 00:54 looking east, the aurora covered the entire night sky over Bavaria

+ oblomov, joycebell, psoul

ProjectPhysX, 23 days ago to random

I found an interesting optimization for the marching-cubes algorithm today: Since vertex interpolation happens on axis-aligned edges of the unit cube, it's sufficient to interpolate in 1D instead of 3D. The faster interpolation makes the conditions for which edge to interpolate unnecessary, which allows getting rid of the edge table. That brings the implementation down to 73 lines, including the triangle table. 🖖🤠
https://github.com/ProjectPhysX/FluidX3D/commit/649fd40fa6270fbd0823a53b2a55f4194fc9510b#diff-464b1d19d4b616b9609031b48429081b2c215328d9f98bc5cbeac6b2b84fdbf3R456
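
To illustrate the idea, here is a minimal Python sketch (not the FluidX3D OpenCL code). It assumes a unit cube with scalar values at the two endpoints of an edge and compares a generic 3D interpolation with the 1D shortcut. The function names are hypothetical, and the sketch only covers edges that are actually crossed by the isosurface (v1 != v2).

```python
def interpolate_3d(p1, p2, v1, v2, iso):
    """Generic vertex interpolation: lerp all three coordinates."""
    t = (iso - v1) / (v2 - v1)
    return tuple(a + t * (b - a) for a, b in zip(p1, p2))


def interpolate_1d(v1, v2, iso):
    """Offset along an axis-aligned unit edge: one subtraction pair and one division."""
    return (iso - v1) / (v2 - v1)


def edge_vertex(corner, axis, v1, v2, iso):
    """Place the vertex on the unit-length edge that starts at `corner` and runs along `axis`."""
    vertex = list(corner)
    vertex[axis] += interpolate_1d(v1, v2, iso)  # only one coordinate changes
    return tuple(vertex)


# Edge from (0, 1, 0) to (1, 1, 0), i.e. along the x axis (axis 0).
full = interpolate_3d((0.0, 1.0, 0.0), (1.0, 1.0, 0.0), 0.0, 1.0, 0.25)
short = edge_vertex((0.0, 1.0, 0.0), 0, 0.0, 1.0, 0.25)
print(full, short)
```

Because the two endpoints of such an edge differ in exactly one coordinate, the other two coordinates never need to be interpolated.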

+ demofox

ProjectPhysX, 22 days ago

@nickserv that's a bug in #ARM's #OpenCL runtime: fused-multiply-add (fma) is somehow emulated with terrible performance. This is very similar to what @niconiconi found on Nvidia CMP 170HX, where fma was disabled in the driver.
I've just fixed this in #FluidX3D, by macro-replacing fma with a*b+c. Performance went up by 8-13x on my Samsung S9+ (ARM Mali-G72 MP18) with this workaround.
https://github.com/ProjectPhysX/FluidX3D/commit/9ce2caecfc85e4fda50fed3350304b75b223b06b
cc @chipsandcheese
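
A rough sketch of how such a macro replacement could be wired up on the host side, written in Python for brevity. It assumes the OpenCL C source is available as a string before compilation. The device-name check and the function name `patch_kernel_source` are hypothetical and not taken from the linked commit.

```python
# Preprocessor macro that shadows the built-in fma() with a plain multiply-add.
FMA_WORKAROUND = "#define fma(a, b, c) ((a)*(b)+(c))\n"


def patch_kernel_source(source: str, device_name: str) -> str:
    """Prepend the macro only for devices assumed to have a slow fma (hypothetical check)."""
    if "mali" in device_name.lower():
        return FMA_WORKAROUND + source
    return source


kernel = "kernel void axpy(global float* y, const global float* x, const float a) {\n" \
         "    const uint i = get_global_id(0);\n" \
         "    y[i] = fma(a, x[i], y[i]);\n" \
         "}\n"
print(patch_kernel_source(kernel, "Mali-G72"))
```

Note that a*b+c rounds twice while a true fma rounds once, so results can differ in the last bit.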

+ niconiconi

ProjectPhysX, 17 days ago

@niconiconi @chipsandcheese @nickserv mad is equally slow as fma 🐌

ProjectPhysX, 1 month ago to GraphicsProgramming

#FluidX3D #CFD update alert! v2.15 speeds up framerate in interactive graphics by 20-70%. 🖖🥳💻

How? Turns out iterating over 2 million pixels with a single CPU core is... really slow. I did that 3 times more than necessary for every frame rendered on screen! 🖖😆
I've now eliminated a memory copy of the frame (in favor of pointer swap), and a clear frame/zbuffer operation on CPU since that's already done on #GPU.

Release notes: https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.15
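
The copy-versus-swap difference can be sketched in a few lines of Python. This is only an illustration under the assumption of two equally sized frame buffers; the class and method names are hypothetical and do not mirror the FluidX3D source.

```python
class FrameBuffers:
    """Two RGBA frame buffers: one on screen, one being rendered into."""

    def __init__(self, width, height):
        self.front = bytearray(width * height * 4)  # frame currently displayed
        self.back = bytearray(width * height * 4)   # frame currently rendered

    def present_by_copy(self):
        # Touches every byte of the frame on the CPU.
        self.front[:] = self.back

    def present_by_swap(self):
        # Constant time: only the two references are exchanged.
        self.front, self.back = self.back, self.front


buffers = FrameBuffers(1920, 1080)
buffers.back[0] = 255
buffers.present_by_swap()
print(buffers.front[0])
```

With a swap, the cost per frame no longer depends on the number of pixels.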

+ sascha

ProjectPhysX, 1 month ago to GraphicsProgramming

One of my #PhD papers got selected for the 2022 Best Paper Award of MDPI Computation! 🖖🥳📃🏆

That was a very bold publication for multiple reasons:

I solo-authored it

I wrote that paper in only 2 weeks

the title contains "Esoteric" twice

I submitted it on April 1st

It's serious science though: I discovered a simple algorithm to cut memory demand of the #LBM in half, allowing huge simulations on cheap #GPUs. This is one of the key innovations in #FluidX3D #CFD.

https://doi.org/10.3390/computation10060092
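
As a back-of-the-envelope illustration of where the factor of two comes from, the sketch below counts the bytes needed per cell for the density distribution functions. It assumes a D3Q19 velocity set stored in 32-bit floats and compares keeping two copies of the data with streaming in place on a single copy. These parameters are assumptions for the example, and other per-cell fields (density, velocity, flags) are left out.

```python
def ddf_bytes_per_cell(velocities=19, bytes_per_value=4, copies=2):
    """Memory for the density distribution functions of one lattice cell."""
    return velocities * bytes_per_value * copies


two_copies = ddf_bytes_per_cell(copies=2)  # separate source and destination arrays
in_place = ddf_bytes_per_cell(copies=1)    # streaming within a single array
print(two_copies, in_place)
```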

ProjectPhysX, 2 months ago to GraphicsProgramming

How realistic can a #CFD simulation be? Here is a 1 billion cell #FluidX3D simulation of an impacting raindrop, fully raytraced in 8K. FluidX3D contains state-of-the-art volume-of-fluid and surface tension models for highly accurate free surface simulations. Combined with my own #OpenCL #raytracing engine, results are rendered on-the-fly at a resolution as large as the remaining #GPU VRAM can hold. 🖖😋💧📺
https://youtu.be/MmLNQIW_Sic
FluidX3D is on #GitHub: https://github.com/ProjectPhysX/FluidX3D

+ oblomov, giuseppebilotta, wonka

fclc, 2 months ago to hpc

Ehhhh the newest big GPU has arrived!

And you can have two of them connected to Grace!

#B100 and #GB200 for the GPU itself and the #grace +2X GPU version

#hpc #GTC

ProjectPhysX, 2 months ago

@fclc FP4 arithmetic on Blackwell... 🖖🤣
Here are all possible values of the glorious FP4 format:

0111 +Inf
0110 +NaN
0101 +NaN
0100 +NaN
0011 +2.0
0010 +1.0
0001 +0.5
0000 +0
1000 -0
1001 -0.5
1010 -1.0
1011 -2.0
1100 -NaN
1101 -NaN
1110 -NaN
1111 -Inf
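
For comparison, a small Python sketch that enumerates all 16 bit patterns of a 4-bit float under a different assumption: the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1, no Inf or NaN codes) as described in the OCP Microscaling Formats specification. Which encoding a given piece of hardware uses is not stated in this thread, so the layout and the function name `decode_e2m1` are assumptions, and the output differs from the table above, which reserves codes for Inf and NaN.

```python
def decode_e2m1(bits):
    """Decode a 4-bit pattern as sign | 2 exponent bits | 1 mantissa bit (bias 1)."""
    sign = -1.0 if bits & 0b1000 else 1.0
    exponent = (bits >> 1) & 0b11
    mantissa = bits & 0b1
    if exponent == 0:
        magnitude = 0.5 * mantissa  # subnormal range: 0 or 0.5
    else:
        magnitude = (1.0 + 0.5 * mantissa) * 2.0 ** (exponent - 1)
    return sign * magnitude


for pattern in range(16):
    print(f"{pattern:04b} {decode_e2m1(pattern):+.1f}")
```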

+ tojiro, aeva

badlogic, 3 months ago to random

ML bubble, I need to spend some money and figured a little desktop machine with enough GPU power to train smaller models would be a fun thing to buy.

Suggestions? Full rig specs preferred! GPU wise there aren't many options other than A6000 and 4090 RTX it seems.

+ eniko

ProjectPhysX, 3 months ago

@badlogic for 2 GPUs better go with a mainboard that supports PCIe bifurcation to two x8 slots to the CPU. Some Z690/Z790 mainboards support this, like the Taichi ones. And make sure the 4090s are only 3-slot, as all the 4-slot models will block the second PCIe slot.
For AI stuff, two 3090s (non-Ti) will perform about the same but are cheaper and draw 2x 100W less power.

ProjectPhysX, 3 months ago to linux

Software should always "just work". To make compiling #FluidX3D easier, I made the compile script smarter: it now automatically detects the operating system (#Linux / #macOS / #Android), #X11 support on Linux, and whether GNU make is installed. 🖖🧐
https://github.com/ProjectPhysX/FluidX3D/commit/f990dfbe3f7a922d1cb6523e8e0b8e6d6cf8c905
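
The same kind of detection can be sketched in Python instead of a shell script. This is only an illustration using the standard library. The Android check via the `ANDROID_ROOT` environment variable and the X11 check via the header directory `/usr/include/X11` are assumed heuristics, not what the linked commit does.

```python
import os
import platform
import shutil


def detect_build_environment():
    """Collect the facts a build script might branch on."""
    system = platform.system()  # e.g. "Linux", "Darwin", "Windows"
    # Assumed heuristic: Android shells usually export ANDROID_ROOT.
    is_android = system == "Android" or (system == "Linux" and "ANDROID_ROOT" in os.environ)
    is_linux = system == "Linux" and not is_android
    return {
        "linux": is_linux,
        "macos": system == "Darwin",
        "android": is_android,
        # Assumed heuristic: X11 development headers live in /usr/include/X11.
        "x11": is_linux and os.path.isdir("/usr/include/X11"),
        "make": shutil.which("make") is not None,
    }


print(detect_build_environment())
```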

azonenberg, 3 months ago to random

So the RTX 2080 Ti in my current office workstation is starting to get a little cramped for me. If I have a large KiCAD design and a ngscopeclient session with a lot of waveforms and filters open simultaneously, I often run out of VRAM.

More compute would be nice as long as it doesn't come with a power budget much higher than my existing 2080 Ti (250W TDP). I plan to stick with NVIDIA since I'm very familiar with their shader debug tools etc.

The only option with more VRAM in the consumer space is the RTX 4090 which I'd like to avoid due to the ludicrous 450W TDP and incompatible power connector.

So that leaves RTX workstation cards.

ProjectPhysX, 3 months ago

@azonenberg Nvidia Ada has crippled VRAM bandwidth on all but the very expensive high end. The cheapest good option is a 2nd hand 3090 (non-Ti), undervolted to reduce TDP.

ProjectPhysX, 3 months ago to GraphicsProgramming

#FluidX3D v2.13 is out, providing faster #VTK export with automatic SI unit conversion and a variety of bug fixes!
Full release notes: https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.13
#GPU #CFD #OpenCL #GPGPU #HPC #GitHub

+ oblomov

ProjectPhysX, 3 months ago to intel

This is wild: #FluidX3D can "SLI" together 🔵 #Intel Arc A770 + 🟢 #Nvidia Titan Xp, pooling 12GB+12GB of their VRAM for one large 450M cell #CFD simulation. Top half on A770, bottom half on Titan Xp. They seamlessly communicate over PCIe. Performance is ~1.7x of what either #GPU could do on its own. 🖖😋🖥🔥
#OpenCL shows its true power here - one implementation works on literally all GPUs at full performance, even at the same time. Happy #SimulationFriday!
https://youtu.be/PscbxGVs52o

+ dneto

ProjectPhysX, 4 months ago to Nvidia

Another day, another #Nvidia #GPU driver bug that needs a workaround: seems like Nvidia's #OpenCL driver suffers 32-bit uint overflow within the cl::CommandQueue::enqueueFillBuffer call! 🖖🤦‍♂️
https://github.com/ProjectPhysX/FluidX3D/commit/82976f15d2bd20b9188ea623cf0bac046c6c81ce
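
The general failure mode can be illustrated in Python: a byte count above 4 GiB silently wraps when it is squeezed into a 32-bit unsigned integer. Where exactly the truncation happens inside the driver is not stated here, and splitting the fill into smaller pieces is just one generic way to stay below the limit, not necessarily what the linked commit does. The function names are hypothetical.

```python
UINT32_MASK = 2**32 - 1


def truncated_size(size_bytes):
    """What a size becomes after being stored in a 32-bit unsigned integer."""
    return size_bytes & UINT32_MASK


def fill_chunks(size_bytes, max_chunk=2**31):
    """Yield (offset, length) pairs that cover the buffer in pieces of at most max_chunk bytes."""
    offset = 0
    while offset < size_bytes:
        length = min(max_chunk, size_bytes - offset)
        yield offset, length
        offset += length


buffer_size = 5 * 2**30  # a 5 GiB buffer
print(truncated_size(buffer_size))  # only 1 GiB survives the truncation
print(list(fill_chunks(buffer_size)))
```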

+ oblomov, giuseppebilotta

ProjectPhysX, 2 months ago

Found and reported another bug in #Nvidia #GPU drivers: passing vector types like int3 as #OpenCL kernel parameters is broken. 🖖🙂

Techaltar, 4 months ago to random

The circle to search feature on the S24 series works so unreasonably well. Took this random 50x zoom photo, did a quick circle and right away got an answer.

This feature in particular uses Google for the actual searching, but even the completely self-developed Galaxy AI features worked surprisingly well. More thoughts in the Friday Checkout

ProjectPhysX, 4 months ago

@Techaltar such online AI features are a nightmare for data privacy. Send all your photos straight to Google, hallelujah. And of what use is this phone when I can't even plug in headphones? 🚫🎧

ProjectPhysX, 4 months ago

@Techaltar @champingsajt do you really think it performs the web search only upon request?
It looks more like it does it in the background for any new photo you take, sends all the data to Google, and caches the search URL so that when the request comes it instantly has the search result.

ProjectPhysX, 4 months ago to random

The final part of my #PhD thesis has now been accepted and published in #Microplastics and #Nanoplastics! 🖖🥳📃🎓
I'm proud to have coauthored this study by Lisa Marie Oehlschlägel. We looked at water-air transfer of microplastics during #bubble bursting in lab experiments, with surprising results:
https://doi.org/10.1186/s43591-023-00079-x 🌊🫧💥

+ giuseppebilotta

ProjectPhysX, 5 months ago to linux

#FluidX3D v2.11 is out! This update fully matches interactive graphics functionality and user interface between Windows and #Linux, and brings faster simulation startup time and bug fixes. 🖖😎💻
Full release notes: https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.11
#GPU #CFD #HPC #KDE

+ giuseppebilotta

giuseppebilotta, 6 months ago to random

OK so I'm ready for today's #GPGPU lesson with the new laptop. My only gripe for the lesson will be that #Rusticl in #Mesa 23.2 doesn't support #profiling information. Apparently the feature was merged at a later commit
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24101
and I even tried upgrading to my distro's experimental 23.3-rc1 packages, but trying to use rusticl on those packages segfaults. So either I've messed up something with this mixed upgrade, or I've hit an actual bug.

ProjectPhysX, 6 months ago

@giuseppebilotta yes they report dual-CUs instead of CUs for some reason. Estimating the TFLOPs/s of hardware from the reported CUs and clock frequency has already required a table of device name fragments before; cores/CU can be 0.5, 1, 8, 16, 64, 128, 192, 256.
https://github.com/ProjectPhysX/OpenCL-Wrapper/blob/master/src/opencl.hpp#L56
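
A minimal Python sketch of such an estimate, assuming peak throughput is compute units times cores per compute unit times 2 floating-point operations per core per cycle (one fused multiply-add) times clock frequency. The lookup table entries and device names below are made up for illustration and do not reproduce the table in the linked file.

```python
# Hypothetical name fragments mapped to cores per reported compute unit.
CORES_PER_CU = {
    "example-gpu": 128,
    "example-dual-cu-gpu": 128,  # a device reporting dual-CUs needs twice the per-CU count
    "example-cpu": 0.5,          # two hardware threads share one core
}


def cores_per_cu(device_name, default=1):
    """Look up cores per compute unit by name fragment (hypothetical table)."""
    name = device_name.lower()
    for fragment, cores in CORES_PER_CU.items():
        if fragment in name:
            return cores
    return default


def estimate_tflops(compute_units, cores, clock_mhz, flops_per_core_per_cycle=2):
    """Peak FP32 estimate in TFLOPs/s; 2 FLOPs per cycle assumes one FMA per core per cycle."""
    return compute_units * cores * flops_per_core_per_cycle * clock_mhz * 1e-6


print(estimate_tflops(82, cores_per_cu("Example-GPU 9000"), 1695))
```

The reported compute unit count alone says little; without the right cores/CU factor the estimate can be off by more than two orders of magnitude.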

+ giuseppebilotta

giuseppebilotta, 6 months ago to Nvidia

This thread
https://mk.absturztau.be/notes/9lain2utf5untfm4
by @niconiconi is both fascinating and frustrating. #NVIDIA has a bad habit of doing market segmentation in software (there have been some infamous cases of NVIDIA releasing driver updates that just uncrippled their desktop GPU performance on new #AMD releases, bringing their performance on par with the equivalent workstation and server GPUs). I wouldn't be surprised if this were the case for the hardware @niconiconi is experimenting on.

ProjectPhysX, 6 months ago

@giuseppebilotta @niconiconi Nvidia have a long history of artificially crippling their GeForce lineup to not eat into the more profitable Quadro market. Either with drivers, or by motivating certain companies to make their "professional" software run like shit if the GPU name is not on the Quadro list. https://youtu.be/uwCu-b7htV8
I suspect similar marketing reasoning with software-crippling the mining cards. Software locks are a major root cause for e-waste unfortunately.

+ giuseppebilotta

ProjectPhysX, 10 months ago to github German

#FluidX3D has passed 2000 Stars! It is the most popular #CFD software on #GitHub now! 🖖😊⭐️
https://github.com/ProjectPhysX/FluidX3D
Feeling blessed that my work is useful to so many people across the globe, with users in 75 countries already! 🌍
42% EU, 30% Americas, 25% Asia, 3% Oceania+Africa

ProjectPhysX, 3 months ago

The red lightning bolt continues: #FluidX3D has passed 3000 Stargazers on #GitHub - from 82 countries! 🖖🥳⭐
Releasing this software for free really has turned out to be a win-win: I've received so much valuable feedback, and answered with as many bug fixes and updates, with many more to come. I am enabling cutting-edge #CFD simulations for everyone, with very modest hardware resources, on literally every computer that has a #GPU, regardless of vendor.
👉 https://github.com/ProjectPhysX/FluidX3D
#SimulationFriday #OpenCL

ProjectPhysX, 3 months ago

@giuseppebilotta https://seladb.github.io/StarTrack-js/#/
