@pervognsen@mastodon.social avatar

pervognsen

@pervognsen@mastodon.social

Performance, compilers, hardware, mathematics, computer science.

I've worked in or adjacent to the video game industry for most of my career.


unormal, to random
@unormal@mastodon.social avatar

Elden Ring is better now that I got good, but I still like AC6 better. Elden Ring is literally too big: Consecrated Snowfield isn't even good, and everything past Limgrave would be better denser instead of stretched like taffy. Still an amazing game.

pervognsen,
@pervognsen@mastodon.social avatar

@unormal Elden Ring was the second best first playthrough I've had of any FromSoftware game but it's also the game of theirs I've had the least enthusiasm for replaying. It's too much dang game.

pervognsen, to random
@pervognsen@mastodon.social avatar

Today I learned that "cyberphysical systems" is a real word.

pervognsen, (edited ) to random
@pervognsen@mastodon.social avatar

It's cool to see several of Niko's ideas in past blog posts come together in this unified vision. https://smallcultfollowing.com/babysteps/blog/2024/06/02/the-borrow-checker-within/

pervognsen, to random
@pervognsen@mastodon.social avatar

I switched from Windows to Linux not directly because of the most recent shit show with Recall (though it's part of the general trend). But it sure does make me feel even better about making that choice when I'm struggling with some minor annoyance on Linux.

pervognsen,
@pervognsen@mastodon.social avatar

@SonnyBonds I'm just using vanilla Arch.

pervognsen, to random
@pervognsen@mastodon.social avatar

It appears that C has a symmetric requirement on the minimum magnitude (32767) of INT_MIN and INT_MAX. I wonder if there are still any 16-bit ones' complement machines humming along in some dark corner of the world.

pervognsen,
@pervognsen@mastodon.social avatar

@seanmiddleditch I remember they were talking about that change but I didn't know it had been ratified. And you're right. I just looked up the limits in the Working Draft.

harold, to random
@harold@mastodon.gamedev.place avatar

Looks like I had forgotten to post the AVX512 histogram tech to Mastodon (I had some back-and-forth about it on Twitter with Pete Cawley), so here it is for the other audience. If you follow me on Twitter you've already seen it; there have been no significant changes recently.

Shitty benchmark results (not super rigorous) included for some idea of how fast it is on an 11600K.

https://gist.github.com/IJzerbaard/97a831715ecf8bb3e994c76e5f266d8e

pervognsen,
@pervognsen@mastodon.social avatar

@harold That Gist doesn't seem to include the definition of FA. I assume it's a (bog standard) full adder/carry-save adder macro?
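For readers unfamiliar with the term, here is a minimal sketch of what a bitwise full adder (carry-save adder) typically looks like. It is an assumption about the general technique, not the `FA` macro from the Gist, which is not shown in the thread. The name `full_adder` is made up for illustration, and plain `u64` words stand in for AVX512 registers.

```rust
/// Bitwise full adder (carry-save adder) over 64 independent bit lanes.
/// For every bit position: a + b + c == sum + 2 * carry.
/// Hypothetical helper for illustration, not the Gist's FA macro.
fn full_adder(a: u64, b: u64, c: u64) -> (u64, u64) {
    let partial = a ^ b;
    let sum = partial ^ c;
    let carry = (a & b) | (partial & c);
    (sum, carry)
}

fn main() {
    let (a, b, c) = (0b1100u64, 0b1010u64, 0b0110u64);
    let (sum, carry) = full_adder(a, b, c);
    // Total number of set bits is preserved, with carry bits weighted by 2.
    assert_eq!(
        a.count_ones() + b.count_ones() + c.count_ones(),
        sum.count_ones() + 2 * carry.count_ones()
    );
    println!("sum = {sum:04b}, carry = {carry:04b}");
}
```

Chaining such adders lets three input words be reduced to two without propagating carries across lanes, which is why they show up in bit-sliced counters.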

pervognsen, (edited )
@pervognsen@mastodon.social avatar

@harold It looks like hist_8_64 is the best performing (better than hist_8_128) according to the benchmark tables on the Turbo-Histogram repo. Is that not the case on your machine?

Anyway, impressive job! It's always nice to see both CSAs and counter flushing get a workout like this. Do you have an AVX2 baseline with the same algorithm structure?

pervognsen,
@pervognsen@mastodon.social avatar

@harold Maybe I'll try an AVX2 implementation at some point. I was just curious how much was AVX512 vs AVX2 vs the algorithm structure, in case you'd already measured that.

dotstdy, to random
@dotstdy@mastodon.social avatar

lmao

pervognsen,
@pervognsen@mastodon.social avatar

@dotstdy A plea for a return to Majordomo. I will also accept Listserv as a compromise. https://en.wikipedia.org/wiki/Majordomo_(software)

pervognsen,
@pervognsen@mastodon.social avatar

@dotstdy It looks like that is abandonware. To be honest I can kind of respect the insanity of someone starting a Perl-based rewrite effort in 2016.

pervognsen, to random
@pervognsen@mastodon.social avatar

I was answering a question on the subreddit and just noticed the Rust book still doesn't cover std::thread::scope, only std::thread::spawn. That's not ideal. It looks like this was added to stable about 2 years ago in 1.63. Having the Rust Foundation pay someone for a basic yearly update pass seems like a good use of funds if you're going to use this book as your recommended introduction for new users.
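As a rough illustration of the difference, here is a minimal sketch using `std::thread::scope`. Scoped threads can borrow local data directly because the scope joins them before it returns, whereas `std::thread::spawn` requires `'static` captures. The function name `parallel_sum` is made up for this example.

```rust
use std::thread;

/// Sums a slice in two halves on scoped threads, borrowing `data` directly.
/// With thread::spawn this would need Arc or cloned data instead.
fn parallel_sum(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        let l = s.spawn(|| left.iter().sum::<u64>());
        let r = s.spawn(|| right.iter().sum::<u64>());
        // Both threads are joined here; the scope would also join them on exit.
        l.join().unwrap() + r.join().unwrap()
    })
}

fn main() {
    let data: Vec<u64> = vec![1, 2, 3, 4, 5, 6];
    println!("total = {}", parallel_sum(&data));
}
```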

pervognsen, to random
@pervognsen@mastodon.social avatar

In this episode of "hummus goes with everything", I had some leftover hummus and nothing but Danish rye bread and can report it works well as a spread in the same vein as leverpostej.

pervognsen,
@pervognsen@mastodon.social avatar

@zeux I guess I just got lucky; I didn't realize this was a standard combo.

madmoose, to random
@madmoose@mastodon.social avatar

From “Captain Blood” by ERE Informatique/Exxos (1998).

Yet another little RLE compression algorithm – the EGA planes appear to be stored in the opposite order from Purple Saturn Day.


pervognsen,
@pervognsen@mastodon.social avatar

@madmoose I think you mean 1988.

pervognsen, to random
@pervognsen@mastodon.social avatar

I stopped paying attention to process nodes around 12 nm but I just noticed Arm Cortex X925 is advertised as designed for a 3 nm process. I'm assuming this has even less to do with lambda than it once did?

pervognsen,
@pervognsen@mastodon.social avatar

@amonakov @dotstdy Clearly what they need is a non-linear asymptotic scale so they can keep juicing those numbers forever. They're already heading down that path.

dpiponi, to random
@dpiponi@mathstodon.xyz avatar

Quinta da Regaleira in Sintra, near Lisbon. But which work of science fiction (or is it weird fiction?) am I thinking of? The author mentions that this was a significant source of inspiration.

pervognsen,
@pervognsen@mastodon.social avatar

@dpiponi (Aside: The recent-ish remake is excellent while being faithful to the aesthetics of the original.)

shachaf, to random
@shachaf@y.la avatar

Is there an implementation of a mutex lock that only uses store-release/load-acquire, with no full memory barriers and no RMW operations?

pervognsen,
@pervognsen@mastodon.social avatar

@shachaf Dekker's algorithm requires sequential consistency, right? It seems like if there was something like Dekker's without that requirement we would all have heard about it.

pervognsen,
@pervognsen@mastodon.social avatar

@shachaf You can't do a disproof within Herlihy's framework for consensus numbers (the concurrency model is sequentially consistent), but thinking about Herlihy got me to google "Herlihy consensus weak memory ordering", which brought up this 2024 paper that looks very relevant: https://arxiv.org/abs/2405.16611

pervognsen,
@pervognsen@mastodon.social avatar

@shachaf They reference this earlier paper which seems to have the negative result: https://www.sri.inf.ethz.ch/publications/attiya2011laws

"We prove that one cannot avoid the use of either [RAW or AWAR]. Unfortunately, enforcing RAW or AWAR is expensive on all current mainstream processors. To enforce RAW, memory ordering–also called fence or barrier–instructions must be used. To enforce AWAR, atomic instructions such as compare-and-swap are required."
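The RAW (read-after-write) pattern in that quote, where a thread writes its own flag and then reads the other thread's flag, is the core of Dekker-style locks. As a rough sketch, here is Peterson's algorithm, a simpler relative of Dekker's, for two threads. It is not taken from the thread or the papers. The names `Peterson` and `run` are made up for illustration, and every access uses `SeqCst` because the classic correctness argument assumes sequential consistency.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, AtomicUsize, Ordering::SeqCst};
use std::thread;

/// Two-thread Peterson lock (illustrative; thread ids must be 0 and 1).
struct Peterson {
    interested: [AtomicBool; 2],
    turn: AtomicUsize,
}

impl Peterson {
    fn new() -> Self {
        Peterson {
            interested: [AtomicBool::new(false), AtomicBool::new(false)],
            turn: AtomicUsize::new(0),
        }
    }

    fn lock(&self, me: usize) {
        let other = 1 - me;
        // Write our flag and yield the turn...
        self.interested[me].store(true, SeqCst);
        self.turn.store(other, SeqCst);
        // ...then read the other flag: the read-after-write that needs ordering.
        while self.interested[other].load(SeqCst) && self.turn.load(SeqCst) == other {
            thread::yield_now();
        }
    }

    fn unlock(&self, me: usize) {
        self.interested[me].store(false, SeqCst);
    }
}

/// Two threads each increment a shared counter `iterations` times under the lock.
fn run(iterations: u64) -> u64 {
    let lock = Peterson::new();
    let counter = AtomicU64::new(0);
    thread::scope(|s| {
        for me in 0..2 {
            let (lock, counter) = (&lock, &counter);
            s.spawn(move || {
                for _ in 0..iterations {
                    lock.lock(me);
                    // Deliberately a separate load and store (no RMW), so
                    // updates would be lost without mutual exclusion.
                    let value = counter.load(SeqCst);
                    counter.store(value + 1, SeqCst);
                    lock.unlock(me);
                }
            });
        }
    });
    counter.load(SeqCst)
}

fn main() {
    println!("counter = {}", run(10_000));
}
```

On most hardware, the `SeqCst` store followed by a `SeqCst` load of a different location is what compiles to a fence or an atomic exchange, which matches the cost the quoted paper describes.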

nh, to random
@nh@mastodon.gamedev.place avatar

I wonder if compilers could meaningfully benefit from smart cache blocking.

LLVM has function pass managers. The idea being that we keep a single function in cache while running many passes on it, before then doing the same on the next function etc.

This makes sense because compilers tend to be pointer chase nightmares. You want that sweet L1 cache hit latency.

But compilers also tend to be branch nightmares. What if you have many, many tiny functions?

pervognsen, (edited )
@pervognsen@mastodon.social avatar

@nh Another example like this is where you receive messages (commands/requests) of different types on a single channel and instead of processing them immediately you batch them (with a combination of a batch size and time cut-off to bound the latency) in separate buckets by message type. Of course this assumes (as does the relocation example) that you can process messages of different types out of order (although you can flush the buckets early if you only need ordering for specific messages).
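A minimal sketch of that bucketing scheme, under stated assumptions: messages are a kind plus a `u32` payload, all buckets are flushed together when either the total count or the age of the oldest pending message hits a limit, and the names `Kind`, `Batcher`, `push`, and `flush` are invented for illustration.

```rust
use std::collections::BTreeMap;
use std::time::{Duration, Instant};

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Kind {
    Read,
    Write,
}

/// Buckets payloads by message kind and flushes on a size or age limit.
struct Batcher {
    buckets: BTreeMap<Kind, Vec<u32>>,
    pending: usize,
    oldest: Option<Instant>,
    max_batch: usize,
    max_delay: Duration,
}

impl Batcher {
    fn new(max_batch: usize, max_delay: Duration) -> Self {
        Batcher { buckets: BTreeMap::new(), pending: 0, oldest: None, max_batch, max_delay }
    }

    /// Adds a message; returns the flushed buckets if a limit was reached,
    /// otherwise an empty vector.
    fn push(&mut self, kind: Kind, payload: u32, now: Instant) -> Vec<(Kind, Vec<u32>)> {
        if self.oldest.is_none() {
            self.oldest = Some(now);
        }
        self.buckets.entry(kind).or_default().push(payload);
        self.pending += 1;
        let too_old = self
            .oldest
            .map_or(false, |t| now.duration_since(t) >= self.max_delay);
        if self.pending >= self.max_batch || too_old {
            self.flush()
        } else {
            Vec::new()
        }
    }

    /// Hands back every bucket, grouped by kind, and resets the batcher.
    fn flush(&mut self) -> Vec<(Kind, Vec<u32>)> {
        self.pending = 0;
        self.oldest = None;
        std::mem::take(&mut self.buckets).into_iter().collect()
    }
}

fn main() {
    let mut batcher = Batcher::new(3, Duration::from_millis(10));
    let t0 = Instant::now();
    batcher.push(Kind::Read, 1, t0);
    batcher.push(Kind::Write, 2, t0);
    // The third message reaches the size limit, so all buckets come back.
    let flushed = batcher.push(Kind::Read, 3, t0);
    println!("{flushed:?}");
}
```

The consumer then processes each bucket in one tight loop per kind, at the price of reordering messages across kinds, which is the trade-off the post describes.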

contextfree, to random
@contextfree@hachyderm.io avatar

I really really really wish wasm had a dup instruction.

pervognsen,
@pervognsen@mastodon.social avatar

@contextfree In general, it seems that Wasm really wants you to use local variables instead of the value stack as soon as you're past the most basic use cases (single-use values in this case). And since Wasm engines have to be designed around this assumption, there's probably not much reason to avoid locals when you generate Wasm.
