High-Performance Computing

ACM, 3 days ago

We are sad to hear of the passing of Gordon Bell, a pioneer in high-performance and parallel computing and the visionary behind the ACM Gordon Bell Prize. His dedication to innovation inspired countless breakthroughs. Our deepest condolences to his loved ones.

#InMemoriam #HPC

HPC_Guru, 19 days ago

Robert Dennard, the inventor of DRAM, and famous for Dennard scaling, died on April 23 at the age of 91

#HPC

https://lohud.com/obituaries/pnys0809210

niconiconi, 19 days ago

In the process of debugging a NUMA first-touch problem, I accidentally found my simulation becomes significantly faster when it's running on garbage data without memset() - even on non-NUMA systems... What?! Does the kernel provide a fast-path for uninitialized memory that I've never heard of?

"a read from a never-written anonymous page will get every page copy-on-write mapped to the same physical page of zeros, so you can get TLB misses but L1d cache hits when reading it."

Yes... #hpc
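
A minimal sketch of the behaviour quoted above, assuming Linux and a private anonymous mapping (the kind glibc's malloc() uses for large allocations). The helper names and the 64 MiB size are illustrative and not from the post. Python is far too slow to show the cache effect itself, so the sketch only shows that never-written pages read back as zeros and that peak resident memory typically grows only once the pages are written.

```python
import mmap
import resource

PAGE = mmap.PAGESIZE
SIZE = 64 * 1024 * 1024          # illustrative mapping size (assumption)
N_PAGES = SIZE // PAGE


def map_private_anonymous(size):
    """Create a private anonymous mapping; nothing is backed by RAM yet."""
    return mmap.mmap(-1, size,
                     flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                     prot=mmap.PROT_READ | mmap.PROT_WRITE)


def read_pages(buf, n_pages):
    """Read one byte from every page and return the sum of those bytes."""
    return sum(buf[i * PAGE] for i in range(n_pages))


def touch_pages(buf, n_pages):
    """Write one byte to every page, forcing a private physical page each."""
    for i in range(n_pages):
        buf[i * PAGE] = 1


def peak_rss():
    """Peak resident set size (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


buf = map_private_anonymous(SIZE)
rss_start = peak_rss()

untouched_sum = read_pages(buf, N_PAGES)   # read faults only
rss_after_read = peak_rss()

touch_pages(buf, N_PAGES)                  # write faults allocate real pages
touched_sum = read_pages(buf, N_PAGES)
rss_after_write = peak_rss()

print("sum before any write:", untouched_sum)
print("sum after writes:    ", touched_sum)
print("peak RSS start / after read / after write:",
      rss_start, rss_after_read, rss_after_write)
```

On a typical Linux system the peak RSS barely moves after the read pass and jumps by roughly the mapping size after the write pass. Exact numbers depend on the kernel, the page size and the transparent huge page settings.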

BenjaminHCCarr, 22 days ago

#AlmaLinux forms a #SIG to advance interests around high-performance computing (#HPC) and artificial intelligence (#AI) for this #RHEL-derived #operatingsystem.
The leader of this new AlmaLinux SIG is #HaydenBarnes, the #OpenSource Community Manager for AI at #HPE.
https://www.phoronix.com/news/AlmaLinux-HPC-AI-SIG

civodul, 22 days ago

The videos from the March ORAP forum on #HPC and reproducibility (#reproductibilité) are online 👇
http://orap.irisa.fr/52ieme-forum-reproductibilite/

hpcnotes, 24 days ago

Experiment comparing reactions across social media platforms ...

One or more of #Fortran or #FP64 or #onprem #hpc will be obsolete by 2030.

Ignoring this message counts as agreeing :-)

fclc, 24 days ago

@abarker @hpcnotes Depending on who you ask, C is already obsolete, and barely hanging on.

(CC: @thephd )

thephd, 22 days ago

@fclc @abarker @hpcnotes Can't really say when C will be obsolete. I'm certainly preventing it as best as I can, but...

boegel, 1 month ago

Miquel Pericàs (Chalmers University of Technology) did a great job during his keynote presentation at the 9th EasyBuild User Meeting in Sweden.

RISC-V is coming, and the European #HPC community is working hard to prepare for it via projects like EUPILOT & co.

https://easybuild.io/eum24/#program

linuxaustralia, 1 month ago

New to the Linux Australia #jobs board: the lovely folks at National Computational Infrastructure are looking for an #HPC #Linux administrator.

#NCI is the leading national provider of high-end computational and data-intensive services. It forms an integral part of the Australian Government’s #research #infrastructure strategy.

--
For more details, or to apply, please visit the listing on the ANU Jobs portal at:

🔗 http://jobs.anu.edu.au/cw/en/job/555156

#fedihired #fedijobs #fedihiring

niconiconi, 2 months ago

Preliminary results of the CPU memory bandwidth micro-benchmark: if you're jumping around memory randomly, try doing number-crunching on contiguous segments of at least 2-4 KiB before jumping again, to amortize the latency penalty down to an acceptable level (throughput is ~70%-80% of a sequential pattern). #hpc
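
The amortization argument can be illustrated with a deliberately crude model: every jump costs one full memory latency, after which the contiguous segment streams at sequential bandwidth. The function name and both hardware figures below are placeholder assumptions, not numbers from the benchmark, and the model ignores prefetching and overlapping requests.

```python
def sequential_fraction(chunk_bytes, latency_s, bandwidth_bytes_per_s):
    """Fraction of sequential throughput reached when each contiguous
    chunk is preceded by one un-hidden random-access latency."""
    transfer_s = chunk_bytes / bandwidth_bytes_per_s
    return transfer_s / (transfer_s + latency_s)


# Hypothetical figures, chosen only for illustration.
ASSUMED_LATENCY_S = 80e-9        # one DRAM access
ASSUMED_BANDWIDTH = 20e9         # bytes per second, single core

for chunk in (64, 512, 2048, 4096, 16384):
    frac = sequential_fraction(chunk, ASSUMED_LATENCY_S, ASSUMED_BANDWIDTH)
    print(f"{chunk:6d} B per jump -> {frac:.0%} of sequential throughput")
```

Under these assumptions a single cache line per jump wastes almost all of the bandwidth, while segments of a few KiB recover most of it, which is the same shape as the result reported in the post.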

azonenberg, 2 months ago

@niconiconi @ignaloidas Going back to my claim from years ago:

DRAM is a block device. Change my mind.

whitequark, 2 months ago

@azonenberg @niconiconi @ignaloidas well it's definitely not a character device so what else could it be? 😇

janekdererste, 2 months ago

Love to #visualize stuff about our model. Here we have the expected computational load for nodes in our simulation network.

I am actually doing something else, but wanted to see how this is distributed in our model.

#hpc #programming #traffic #modelling #matsim

janekdererste, 2 months ago

@asltf @thijs_lucas The simulation program is called MATSim and can, in principle, compute all kinds of transport modes. The image shown includes car, freight, bicycle and walking.

Normally we also include public transport, but not here, because this is more about the computer science aspect and less about the simulation study.

We can also model things like demand-responsive transport (taxi, Uber).

The original model is available here:
https://github.com/matsim-scenarios/matsim-berlin

prefec2, 2 months ago

@janekdererste @asltf @thijs_lucas how cool is that? An open-source traffic simulator. 😁

fclc, 2 months ago

Ehhhh the newest big GPU has arrived!

And you can have two of them connected to Grace!

#B100 for the GPU itself and #GB200 for the #grace + 2x GPU version

#hpc #GTC

ProjectPhysX, 2 months ago

@fclc FP4 arithmetic on Blackwell... 🖖🤣
Here are all the possible values of the glorious FP4 format:

0111 +Inf
0110 +NaN
0101 +NaN
0100 +NaN
0011 +2.0
0010 +1.0
0001 +0.5
0000 +0
1000 -0
1001 -0.5
1010 -1.0
1011 -2.0
1100 -NaN
1101 -NaN
1110 -NaN
1111 -Inf
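
The table above can be reproduced with a small decoder. This is a sketch of the layout exactly as listed in the post (one sign bit, then a 3-bit magnitude code), not a vendor specification; the function name is made up for illustration.

```python
import math


def decode_fp4_as_listed(bits):
    """Decode a 4-bit pattern following the table in the post:
    bit 3 is the sign, the low three bits select the magnitude."""
    if not 0 <= bits <= 0b1111:
        raise ValueError("expected a 4-bit value")
    sign = -1.0 if bits & 0b1000 else 1.0
    code = bits & 0b0111
    if code == 0:
        return sign * 0.0                     # +0 or -0
    if code == 0b111:
        return sign * math.inf                # +Inf or -Inf
    if code >= 0b100:
        return math.copysign(math.nan, sign)  # the three NaN codes
    return sign * 2.0 ** (code - 2)           # 0.5, 1.0, 2.0


for pattern in range(16):
    print(f"{pattern:04b} {decode_fp4_as_listed(pattern):+}")
```

Read this way, only three of the eight magnitude codes carry a nonzero finite value, which is the point the post is making.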

rygorous, 2 months ago

@steve @amonakov @ProjectPhysX @fclc @fay59 I will say that, as a former PC GPU shader compiler dev, it was interesting talking to mobile GPU compiler devs in the early 2010s. For them, GLSL's FP rules boiled down to "don't be evil" on a coffee-stained napkin, whereas the requirements for our FP environment were quite a bit more nailed down, specifically in the now-public https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#3.1%20Floating%20Point%20Rules

HPC_Guru, 2 months ago

Which much-hyped programming language had the least adoption?

Not a #HPC question. Any programming language.

hyc, 2 months ago

@HPC_Guru Prolog?

HPC_Guru, 2 months ago

64K Kernel Page Size Performance Benefits For #HPC Refreshed

This round includes NVIDIA's GH200, along with the AMD & Intel CPUs

Linux 6.8 kernel performance with a 64K page size improved on average by about 15%

https://www.phoronix.com/review/aarch64-64k-kernel-perf

HPC_Guru, 2 months ago

@feld My understanding is that to get a larger page size than 64K in Linux, one would have to enable huge pages

Huge pages can use larger memory blocks (e.g., 2MB or 1GB)

https://blog.netdata.cloud/understanding-huge-pages/

HPC_Guru, 3 months ago

#ISC24 Registration Now Open

Take advantage of early-bird pricing by registering before March 27

https://isc-hpc.com/registration-2024.html

#HPC #AI

HPC_Guru, 3 months ago

Intel CEO Pat Gelsinger: I hope to build chips for Lisa Su and AMD

Goal is to be the foundry for the world and that includes competitors

https://tomshardware.com/pc-components/cpus/intel-ceo-pat-gelsinger-i-hope-to-build-chips-for-lisa-su-and-amd

#HPC #AI

Methylzero, 3 months ago

#HPC #CUDA #OpenCL #LAPACK
If you had to do a lot of linear least-squares solves, with potentially rank-deficient matrices, what would you use on a GPU? On CPUs, LAPACK's DGELSY does work, but most GPU libraries do not seem to implement routines for rank-deficient matrices.

niconiconi, 3 months ago

Memory Bandwidth Is All You Need #hpc

ignaloidas, 3 months ago

@niconiconi If only...

I've been looking into ways to accelerate my SAT solving stuff, and there just isn't an easy hardware out to take...

rupdecat, 3 months ago

Everything set up. Waiting for students.

First round of the #Snakemake for #HPC users course (in preparation).

#OpenScience and #Reproducibility do demand efforts! Teaching is important!

I am so excited. Like it's the first time in a classroom ... 😊

civodul, 3 months ago

Attending the #HPC talk by my colleague Philippe Swartvagher, showcasing #Guix (and more!) as a foundation for #ReproducibleResearch workflows. Wo0t!
https://fosdem.org/2024/schedule/event/fosdem-2024-2651-making-reproducible-and-publishable-large-scale-hpc-experiments/

fclc, 3 months ago

This is an @dougall appreciation post, his SVE instruction visualizer is great https://dougallj.github.io/asil/ #aarch64 #sve #hpc #arm

niconiconi, 3 months ago

ProTip: Do NOT set OMP_PROC_BIND and OMP_PLACES globally. At least for GOMP, it breaks multiprocessing in many non-OpenMP applications. I wrote them into /etc/environment and then started wondering why Gentoo's code compilation was only able to use 1 CPU core. :woozypad: #hpc
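
One way to follow this advice is to set the two variables only for the process that needs them, instead of in /etc/environment. The sketch below does that from a launcher script; the helper name and the default values "cores" and "close" are illustrative choices, and the child command is a stand-in for a real OpenMP program.

```python
import os
import subprocess
import sys


def run_with_omp_binding(cmd, places="cores", bind="close", **kwargs):
    """Run cmd with OMP_PLACES and OMP_PROC_BIND set for that process only.

    The parent's environment is copied, so nothing leaks into the shell
    or into unrelated programs.
    """
    env = dict(os.environ, OMP_PLACES=places, OMP_PROC_BIND=bind)
    return subprocess.run(cmd, env=env, check=True, **kwargs)


# Stand-in for an OpenMP binary: a child that reports what it sees.
child = [sys.executable, "-c",
         "import os; print(os.environ['OMP_PLACES'], os.environ['OMP_PROC_BIND'])"]
result = run_with_omp_binding(child, capture_output=True, text=True)
print("child saw:", result.stdout.strip())
```

The same effect is available in a shell by prefixing a single command with the variable assignments, which keeps the binding local to that invocation.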

azonenberg, 3 months ago

@niconiconi Reminds me that OMP_WAIT_POLICY has to be PASSIVE for ngscopeclient to work properly and I never figured out why. I think it has to do with multiple threads spawning OpenMP tasks at once and getting confused, idk.

I'm gradually moving the project away from OMP and towards application-managed threading or GPU processing anyway so that might end up being the fix.

civodul, 3 months ago

📺 Videos of the Nov. 2023 Workshop on Reproducible Software Environments for Research and High-Performance Computing are on-line!
https://hpc.guix.info/events/2023/workshop/program/

Videos include short interviews with the speakers. Tutorial material is also available from that page.

Many thanks to the speakers and to the video team at Institut Agro!

#ReproducibleResearch #OpenScience #HPC #Guix

HPC_Guru, 4 months ago

Today, Intel celebrated the opening of Fab 9 in Rio Rancho, New Mexico

The milestone is part of Intel's $3.5B investment to equip its New Mexico operations for the manufacturing of advanced semiconductor packaging technologies including Foveros

https://www.intel.com/content/www/us/en/newsroom/news/intel-opens-fab-9-new-mexico.html

#HPC

niconiconi, 4 months ago

Finally understood how to generalize the diamond tiling algorithm from 1D+1T spacetime to 2D+1T spacetime for stencil code. This mysterious diagram now makes perfect sense. But to the researchers at Keldysh Institute of Applied Mathematics, this algorithm has already been obsolete for 20 years. It was already in use in Russian HPC code of the late 1990s and was known as "ConeTur". In the meantime, they invented 3 generations of newer algorithms, which are even more incomprehensible. Keldysh is at least 10 years ahead of the rest of the world... #hpc

gorplop, 4 months ago

@niconiconi oh i get it now, this is pretty neat! thanks :)

niconiconi, 4 months ago

@gorplop A related field is polyhedral compilers, which can automatically do loop transformations in this manner, and are the mainstream of research today. GCC's Graphite optimizer is an example, and there are many HPC-specific ones. The idea is that they are general-purpose: given an unmodified loop, they can apply extremely complicated patterns beyond human understanding. But the heavy focus on automatic code generation means that if your code doesn't match a pattern the compiler already knows, it likely won't perform very well. On the other hand, this Keldysh research team's algorithms are designed by hand for specific algorithms, all have geometric interpretations, and are meant for human use. That approach appears to be rarely studied or implemented by anyone else today, as the algorithms are just too difficult to reason about.

turniphead, 4 months ago

Hello!

Having a second attempt at a tech focused account. Previous was at a server that was too small, and missed a lot of interesting stuff.

By day I'm a software developer, mainly working in Go on #hpc and #containers stuff.

Partial to old Sun / SGI stuff, and the Atari ST. Have a love-hate relationship with #Linux and #FOSS

My main for non-tech stuff (homebrewing, music, Cornwall) is @dctrud

#introduction

hyc, 4 months ago

@turniphead @dctrud I wonder how many Atari ST fans are still around

turniphead, 4 months ago

@hyc well... I'm not far into my 4th decade, so hopefully plenty of us are still around :-)
