giuseppebilotta, to GraphicsProgramming
@giuseppebilotta@fediscience.org

Well, this is interesting.

Someone has posted an announcement for open postdoc and RA positions on the forum:
https://gpusph.discourse.group/t/postdoc-ra-positions-at-oregon-state-university/207
Although they are not specifically about GPUSPH the software, they cover closely related topics (including wave modeling), so I think I'll leave the announcement up.

giuseppebilotta, to random
@giuseppebilotta@fediscience.org

So for a month now we've had a new young researcher working with us on #GPUSPH simulations. After the first 10 days or so of onboarding she has started on her first hands-on experience, writing a test case. This is always a very educational thing — for me. It really drives home how inadequate our documentation is 8-P

giuseppebilotta, to random
@giuseppebilotta@fediscience.org

Related to my previous toot, it's remarkable how much dimensionality determines workload. In the abstract we all know this, since 1D problems scale with n, 2D with n^2 and 3D with n^3. But at least for me it's still kind of amazing how big the difference is in practice.

1/n

giuseppebilotta,
@giuseppebilotta@fediscience.org

Again, this isn't “new”, but I still find myself surprised whenever it happens. Fun fact: #GPUSPH has been developed for the better part of the last 15 years to be 3D-only. Among other things, this meant that one had to be careful when pushing resolutions too much, since halving the inter-particle spacing meant approximately 8 times more particles: it's easy to get to hundreds of millions of particles that way! And of course the timestep goes down by a factor of two as well, so …

6/n

giuseppebilotta,
@giuseppebilotta@fediscience.org

… the total computational workload (iterations * particles) grew by a factor of 16 for every doubling of the resolution —and that was only for inviscid flows.
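
To spell out the arithmetic behind these factors (a back-of-the-envelope sketch, writing h for the inter-particle spacing, d for the number of dimensions, and using the usual CFL-type constraint for the timestep):

% particle count, timestep and iteration count as functions of the spacing h
N_{\text{part}} \propto h^{-d}, \qquad
\Delta t \propto h \;\Rightarrow\; N_{\text{iter}} \propto h^{-1}, \qquad
W = N_{\text{iter}} \cdot N_{\text{part}} \propto h^{-(d+1)}

so halving h multiplies the total work W by 2^{d+1}: 16 in 3D, but only 8 in 2D (and a 2D run starts from far fewer particles in the first place).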

When I first introduced 2D support in #GPUSPH and finally got it to a point where I could actually run a simulation, I couldn't believe my eyes: «it can't be done already». And then finding I could easily push it to resolutions of 1/128 or even 1/256, where in 3D 1/64 was already taking chances …

7/n

giuseppebilotta, (edited) to random
@giuseppebilotta@fediscience.org

I've been working on thermal support in #GPUSPH, and was finally at the stage where I could run some tests and check the results. Except that #ParaView was refusing to open my files, complaining about “invalid token”s. I just spent over half an hour trying to understand what I had changed in my code that had broken the output, even though I hadn't touched anything related to it recently … turns out it wasn't my problem, but an issue between a ParaView upgrade and libexpat:
https://discourse.paraview.org/t/i-cannot-read-a-vtp-file-i-could-open-yesterday-can-someone-try-to-open-it/13938/12

giuseppebilotta, to random
@giuseppebilotta@fediscience.org

One of the reviewers for the manuscript introducing CPU support in #GPUSPH asked for scalability tests on more than 8 cores (when I originally wrote the whole thing, the only decent CPU I had at hand was an AMD Ryzen 7 3700X 8-Core Processor). It's a reasonable request, so I've been running tests on the new server we got at #INGV, which sports dual AMD EPYC 7713 64-Core Processors. The most interesting result so far is that GPUSPH does seem to scale decently, but the baseline single-core performance is lower than expected.

#HPC

giuseppebilotta,
@giuseppebilotta@fediscience.org

I never really expected the “quick hack” I did to run on CPU to be “state of the art” for any meaning of the word, and I was actually pretty surprised by how good the results were, with a relatively low error (it confirmed my idea that good GPU designs are also good for multi-core CPUs). And I most definitely don't expect it to scale optimally on a NUMA system with 64 cores per node (not even counting SMT here). What I'm surprised about is the single-core performance.
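
For what it's worth, the usual first suspect on a dual-socket machine like this is memory placement rather than the code itself. Below is a minimal, hypothetical sketch (not GPUSPH's actual code, and assuming an OpenMP-style CPU backend) of the classic first-touch pattern: arrays are initialized by the same threads, with the same static schedule, that will later process them, so the pages end up on each thread's local NUMA node.

// Hypothetical first-touch sketch (not GPUSPH code). On NUMA systems,
// a memory page is placed on the node of the thread that first writes it,
// so parallel initialization with the same schedule as the compute loop
// keeps each thread's data on its local node.
#include <cstddef>
#include <omp.h>

int main() {
    const std::size_t n = std::size_t(1) << 26;  // e.g. ~67M entries
    double *rho = new double[n];                 // pages not yet touched

    #pragma omp parallel for schedule(static)
    for (std::size_t i = 0; i < n; ++i)
        rho[i] = 0.0;                            // first touch happens here, per thread

    #pragma omp parallel for schedule(static)
    for (std::size_t i = 0; i < n; ++i)
        rho[i] += 1.0;                           // compute loop reuses node-local pages

    delete[] rho;
    return 0;
}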

giuseppebilotta, to Futurology
@giuseppebilotta@fediscience.org

My institution (INGV) has a call open for an 18-month paid research internship (“assegno professionalizzante”) to work with us on the modeling of plastic transport by water waves. The call closes on January 18 and is available here:

https://amministrazione-trasparente.ingv.it/web/trasparenza/papca-p/-/papca/display/3205320?p_auth=OJzYgL2s&p_p_state=pop_up

Feel free to contact me for details, or to circulate the call among friends and colleagues who might be interested.

giuseppebilotta, to random Italian
@giuseppebilotta@fediscience.org

Heads-up: there's an open call for a professionalizing fellowship (“assegno professionalizzante”) at INGV, within a project working on the modeling of coastal wave motion for plastic dispersion. If you know someone who might be interested, please spread the word.

https://amministrazione-trasparente.ingv.it/web/trasparenza/papca-p/-/papca/display/3205320?p_auth=OJzYgL2s&p_p_state=pop_up

giuseppebilotta,
@giuseppebilotta@fediscience.org

Reminder about the call (deadline January 18) for a professionalizing fellowship (“assegno professionalizzante”) at INGV, within a project working on the modeling of coastal wave motion for plastic dispersion. If you know someone who might be interested, please spread the word (here and elsewhere).

https://amministrazione-trasparente.ingv.it/web/trasparenza/papca-p/-/papca/display/3205320?p_auth=OJzYgL2s&p_p_state=pop_up

giuseppebilotta,
@giuseppebilotta@fediscience.org

Last few hours to come work with me at INGV by applying to this call for an 18-month professionalizing fellowship (“assegno professionalizzante”).

https://amministrazione-trasparente.ingv.it/web/trasparenza/papca-p/-/papca/display/3205320?p_auth=OJzYgL2s&p_p_state=pop_up

Topic: simulation of nearshore plastic transport and the search for related strategies. We'll be using our own GPUSPH, so this is also a great opportunity for anyone curious about programming.

giuseppebilotta, to Nvidia
@giuseppebilotta@fediscience.org

This thread
https://mk.absturztau.be/notes/9lain2utf5untfm4
by @niconiconi is both fascinating and frustrating. NVIDIA has a bad habit of doing market segmentation in software (there have been some infamous cases of NVIDIA releasing driver updates that simply uncrippled their desktop GPU performance, bringing it on par with the equivalent workstation and server GPUs). I wouldn't be surprised if this were the case for the hardware @niconiconi is experimenting on.

giuseppebilotta,
@giuseppebilotta@fediscience.org

That being said, I'm still curious how GPUSPH would run on it; I should probably ask @niconiconi if they have some free time to give it a spin.

giuseppebilotta, to random
@giuseppebilotta@fediscience.org

I'm sitting here trying to finish the presentation to be given next week, and while the thing is “done” overall, I can't think of anything to put on the final slide (the conclusions). I'm stymied.

giuseppebilotta,
@giuseppebilotta@fediscience.org

Part of the issue is that the presentation is more of a showcase of what the software can do and how, so it's hard to distill it into a classic bullet-point synthesis. I could go with «… and this is why it's awesome», but I'm not sure the audience has the sense of humor to take that the right way.

giuseppebilotta, (edited) to Amd
@giuseppebilotta@fediscience.org

Anyway, as I mentioned recently, I have a new workstation that finally allows me to test our code using all three backends (CUDA, HIP/ROCm and CPU), thanks to having an AMD processor with an integrated GPU in addition to a discrete AMD GPU.
Of course the iGPU is massively underpowered compared to the high-end dGPU workhorse, but I would expect it to outperform the CPU on most workloads.
And this is where things get interesting.
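
As a concrete illustration of the setup (a minimal sketch, assuming the HIP runtime is installed; device names, counts and properties will obviously differ from machine to machine), this is roughly how one can check which GPUs are visible and which one is the integrated part:

// Minimal HIP device enumeration sketch: list every visible GPU and
// report whether it is an integrated (iGPU) or discrete (dGPU) device.
#include <cstdio>
#include <hip/hip_runtime.h>

int main() {
    int count = 0;
    if (hipGetDeviceCount(&count) != hipSuccess || count == 0) {
        std::fprintf(stderr, "no HIP devices found\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        hipDeviceProp_t prop;
        hipGetDeviceProperties(&prop, dev);
        std::printf("device %d: %s (%s, %d CUs, %.1f GiB)\n",
                    dev, prop.name,
                    prop.integrated ? "integrated" : "discrete",
                    prop.multiProcessorCount,
                    prop.totalGlobalMem / double(1 << 30));
    }
    return 0;
}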

giuseppebilotta,
@giuseppebilotta@fediscience.org

For reference, I'm testing this hardware primarily with two pieces of software: one is an internal cellular automaton model that we use for the assessment of lava flow invasion hazard, and the other is GPUSPH, which I've already talked about. These two codebases are very different, and it's interesting to see how their differences affect the performance ratios I'm observing across the available hardware.

giuseppebilotta,
@giuseppebilotta@fediscience.org

For comparison, the discrete GPU is over 50× faster than the CPU here, and that's actually on the low side of things, due to many kernels being memory-bound rather than compute-bound, and to no optimization attempts having been made yet for this hardware.

But, and this is where things get surprising, the performance of the iGPU drops, failing to reach even 2× over the CPU.

Why would something more compute-intense have a lower performance ratio?

giuseppebilotta,
@giuseppebilotta@fediscience.org

There's another important difference I haven't mentioned between the CA and GPUSPH. All of the GPU code in the CA is “custom”: compute kernels written by yours truly. In GPUSPH, instead, there are a few instances where we rely on an external library: Thrust.

I've already complained a bit about how this affects us <https://fediscience.org/@giuseppebilotta/110283708975056091> especially in terms of backend support, but things are even worse, and I'll take the opportunity here to complain a bit!

giuseppebilotta,
@giuseppebilotta@fediscience.org

So, one of the reasons why we could implement the HIP backend easily in GPUSPH is that ROCm provides drop-in replacements for much of the CUDA libraries, including rocThrust, which (as I mentioned in the other thread) is a fork of Thrust with a HIP/ROCm backend.
This is good as it reduces porting effort, but it also means you have to trust the quality of the provided implementation.
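
To give an idea of what “drop-in replacement” means in practice, here is a tiny generic sketch (not GPUSPH code): the HIP runtime API mirrors the CUDA one almost one-to-one, so most host code ports with little more than a rename pass, which is what the hipify tools automate.

// The HIP calls below correspond 1:1 to cudaMalloc / cudaMemset /
// cudaDeviceSynchronize / cudaFree; kernel launches keep the same
// triple-chevron syntax under hipcc.
#include <hip/hip_runtime.h>

int main() {
    float *d_buf = nullptr;
    hipMalloc((void **)&d_buf, 1024 * sizeof(float));
    hipMemset(d_buf, 0, 1024 * sizeof(float));
    hipDeviceSynchronize();
    hipFree(d_buf);
    return 0;
}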

giuseppebilotta,
@giuseppebilotta@fediscience.org

(That being said, if anyone wants to implement a sort-by-key and a segmented reduction that don't depend on Thrust, and contribute them to GPUSPH, I'm not going to complain.)
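
For context, this is roughly the shape of the two operations in question as expressed with Thrust today (an illustrative sketch with made-up names, not the actual GPUSPH code, assuming the usual cell-hash-based neighbor search):

// Illustrative sketch of the two Thrust calls mentioned above.
// Output vectors must be pre-sized to (at least) the number of cells.
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>

// sort-by-key: reorder particle indices by their cell hash, so that
// particles belonging to the same cell become contiguous
void sort_particles_by_cell(thrust::device_vector<unsigned int> &cellHash,
                            thrust::device_vector<unsigned int> &particleIdx)
{
    thrust::sort_by_key(cellHash.begin(), cellHash.end(), particleIdx.begin());
}

// segmented reduction: given per-particle values already ordered by the
// (sorted) cell hash, sum them per cell in a single pass
void reduce_per_cell(const thrust::device_vector<unsigned int> &sortedHash,
                     const thrust::device_vector<float> &perParticle,
                     thrust::device_vector<unsigned int> &cellId,
                     thrust::device_vector<float> &perCell)
{
    thrust::reduce_by_key(sortedHash.begin(), sortedHash.end(),
                          perParticle.begin(),
                          cellId.begin(), perCell.begin());
}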

giuseppebilotta, to random
@giuseppebilotta@fediscience.org

I've been horribly busy these days with lots of trivial but time-consuming bureaucratic stuff, to the point that I've been unable to work on GPUSPH at all. Worse, I haven't even started working on my presentation for SPHERIC (the material is ready, since the article for the proceedings has already been submitted, so it's really just a matter of building the presentation).

https://www.spheric2023.com/
