I'm excited to go eat some lobster rolls... err, I mean that I am excited to announce that I will be speaking at DevConf.US, August 14th - 16th, in Boston, USA 🦞
I will also be staffing an Ubuntu community booth alongside my fellow yinzer @AKernelPanic. If you're in the Boston area and interested in helping with the booth, or picking up some limited-edition Noble Numbat swag, check the link below! Hope to see you there! 👇
We are sad to hear of the passing of Gordon Bell, a pioneer in high-performance and parallel computing and the visionary behind the ACM Gordon Bell Prize. His dedication to innovation inspired countless breakthroughs. Our deepest condolences to his loved ones.
In the process of debugging a NUMA first-touch problem, I accidentally found that my simulation becomes significantly faster when it runs on garbage data without memset() - even on non-NUMA systems... What?! Does the kernel provide a fast path for uninitialized memory that I've never heard of?
"a read from a never-written anonymous page will get every page copy-on-write mapped to the same physical page of zeros, so you can get TLB misses but L1d cache hits when reading it."
ICYMI: We're excited to announce the AlmaLinux community's newest Special Interest Group: The High Performance Computing and Artificial Intelligence SIG! 🎉
CERN trusts AlmaLinux as the base for offering access to non-CERN sites, as well as for the virtual machine and container images that will be distributed outside of CERN.
New to the Linux Australia #jobs board: the lovely folks at National Computational Infrastructure are looking for an #HPC #Linux administrator.
#NCI is the leading national provider of high-end computational and data-intensive services. It forms an integral part of the Australian Government’s #research #infrastructure strategy.
--
For more details, or to apply, please visit the listing on the ANU Jobs portal at:
#TIL C++26 is planning to add the whole of BLAS to the standard library, with native support for std::mdspan (the official multi-dimensional array type since C++23). C++ surely is a programming language that people just throw everything imaginable into. #hpc
Preliminary results of the CPU memory bandwidth micro-benchmark: if you're jumping around memory randomly, try doing number-crunching on contiguous segments of at least 2-4 KiB before jumping again, to amortize the latency penalty down to an acceptable level (throughput is ~70-80% of a sequential pattern). #hpc
It might just be that I'm more proficient at analyzing and working around #GPU quirks (that happens when you do mostly #GPGPU for more than a decade) than #CPU ones, but there are so many weird things happening on this machine that I don't know where to start.
Just to mention one: why is it that the performance per core when using #OpenMP drops by 40% when switching from 1 to 2 threads, but only when using OMP_PROC_BIND=close and not when using OMP_PROC_BIND=spread? If anything I'd expect the reverse.
And then adding more threads gives me almost perfect scaling, at least up to 16 threads, before dropping again … WTH is happening here? Honestly wouldn't mind some #fediHelp with suggestions on what to look at/for … #HPC #askFedi do your magic!