pervognsen

@pervognsen@mastodon.social

Performance, compilers, hardware, mathematics, computer science.

I've worked in or adjacent to the video game industry for most of my career.

pervognsen, 2 days ago to random

Hundreds of rooks have moved into some trees outside my mom's apartment since I arrived. They start cawing around 4 AM. It's now barely 5 AM and I am wide awake.

pervognsen, 2 days ago

The loud, constant cawing for 18 hours a day is a bit much, but you start to get used to it a few hours into each day, and at this point I'm also feeling less paranoid about them planning a Hitchcock's Birds style massacre.

pervognsen, 2 days ago (edited 2 days ago)

I guess they're too busy anyway: https://en.wikipedia.org/wiki/Rook_%28bird%29#Breeding

Doomed_Daniel, 2 days ago to random

I'm not sure if everyone shares my fondness for X macros, but I sure like them:
https://github.com/DanielGibson/dhewm3/blob/a0bd8161445462daf6253da3ce9b7eeee5a64c89/neo/sys/sys_imgui.cpp#L516-L886

(that code writes and reads Dear ImGui styles to/from an ini-like text file)

pervognsen, 2 days ago

@Doomed_Daniel By the way, my favorite variant of X macros is to use a separate .inl file that's parameterized and configured by a bracketed set of #defines by the includers. It often lets you get away with much cleaner syntax.

pervognsen, 2 days ago

@Doomed_Daniel At the very least it gets rid of the \ line continuations and also gives you proper file/line information in errors or whatever. Not worth it for small tables but I almost always do it as soon as the table is more than a relatively small handful of items.
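
For readers who haven't met the pattern, here is a minimal single-file sketch of an X macro table. The color names, values, and the find_color helper are hypothetical and are not taken from the dhewm3 code or the gist. The comments indicate how the same table might look as a separate .inl file, which is where the backslashes go away.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical style-color table. In the .inl variant, the entries would sit in
// their own file (say "colors.inl") as plain lines without continuations, e.g.
//     COLOR(Text, 0xFFFFFFFFu)
// and each includer would #define COLOR, #include "colors.inl", then #undef COLOR.
#define COLOR_TABLE(X)           \
    X(Text,       0xFFFFFFFFu)   \
    X(Background, 0xFF202020u)   \
    X(Border,     0xFF808080u)

// Expansion 1: one enumerator per table entry.
enum ColorId {
#define X(name, rgba) Color_##name,
    COLOR_TABLE(X)
#undef X
    Color_COUNT
};

// Expansion 2: the entry names as strings, e.g. for writing keys to a text file.
static const char* const kColorNames[Color_COUNT] = {
#define X(name, rgba) #name,
    COLOR_TABLE(X)
#undef X
};

// Expansion 3: the default values, in the same order as the enum.
static const unsigned kColorDefaults[Color_COUNT] = {
#define X(name, rgba) rgba,
    COLOR_TABLE(X)
#undef X
};

// Looks up a color by name, as an ini-style reader might; returns -1 if unknown.
int find_color(const char* name) {
    assert(name != nullptr);
    for (int i = 0; i < Color_COUNT; ++i) {
        if (std::strcmp(kColorNames[i], name) == 0) return i;
    }
    return -1;
}
```

Adding a fourth entry to the table updates the enum, the name array, and the defaults array together, which is the point of the technique.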

pervognsen, 2 days ago

@Doomed_Daniel E.g. this kind of thing, https://gist.github.com/pervognsen/739839769c22653d154cd5cc0f1bda8e

chandlerc, 2 days ago to random

C++ data structure API design question...

What are folks' favorite ways to design a data structure that supports users providing two closely coupled custom functions? Why that pattern?

Specifically, imagine a hash table data structure that wants to allow users to deeply customize both the hash function and the equality comparison.

Current ideas, w/o ranking or even saying I like them, and interested in others:

A type parameter with static functions

two lambda template parameters

CRTP
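
To make the first option concrete, here is a rough sketch of a container parameterized by one policy type with static functions. TinySet, CaseInsensitive, and tiny_set_example are hypothetical names invented for illustration. This is not a proposal from the thread, only one possible shape of that design.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical policy: case-insensitive hashing and equality bundled in one type,
// so the two coupled functions are always supplied together.
struct CaseInsensitive {
    static std::size_t hash(const std::string& s) {
        std::uint64_t h = 0xcbf29ce484222325ull;  // FNV-1a offset basis
        for (unsigned char c : s) {
            h ^= static_cast<std::uint64_t>(std::tolower(c));
            h *= 0x100000001b3ull;  // FNV-1a prime
        }
        return static_cast<std::size_t>(h);
    }
    static bool equal(const std::string& a, const std::string& b) {
        if (a.size() != b.size()) return false;
        for (std::size_t i = 0; i < a.size(); ++i) {
            if (std::tolower(static_cast<unsigned char>(a[i])) !=
                std::tolower(static_cast<unsigned char>(b[i]))) return false;
        }
        return true;
    }
};

// Toy fixed-bucket hash set; all customization flows through Policy's statics.
template <typename Key, typename Policy>
class TinySet {
public:
    // Returns false if a key equal under Policy::equal is already stored.
    bool insert(Key key) {
        std::vector<Key>& bucket = buckets_[Policy::hash(key) % kBuckets];
        for (const Key& k : bucket) {
            if (Policy::equal(k, key)) return false;
        }
        bucket.push_back(std::move(key));
        return true;
    }
    bool contains(const Key& key) const {
        for (const Key& k : buckets_[Policy::hash(key) % kBuckets]) {
            if (Policy::equal(k, key)) return true;
        }
        return false;
    }
private:
    static constexpr std::size_t kBuckets = 16;
    std::vector<Key> buckets_[kBuckets];
};

// Usage: keys that differ only by case collapse to a single entry.
inline void tiny_set_example() {
    TinySet<std::string, CaseInsensitive> set;
    const bool first = set.insert("Hello");
    const bool second = set.insert("HELLO");
    assert(first);
    assert(!second);
    assert(set.contains("hello"));
}
```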

pervognsen, 2 days ago

@chandlerc Something to take into account--one of the most common reasons I don't want to use the type-based default for something like a hash table is when the query key isn't directly "of the same kind" as the stored item or key. So I don't usually want to customize the data structure, I want to customize the specific lookup/insert invocation, for example.
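
A small sketch of what customizing the invocation rather than the data structure could look like, assuming a toy map with std::string keys. TinyMap, find_with, and hash_bytes are hypothetical names. The query here is a (pointer, length) slice into a larger buffer, so no temporary std::string is built for the lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// FNV-1a over raw bytes. Stored keys and query keys both go through this,
// so equal byte sequences always land in the same bucket.
inline std::size_t hash_bytes(const char* p, std::size_t n) {
    std::uint64_t h = 0xcbf29ce484222325ull;
    for (std::size_t i = 0; i < n; ++i) {
        h ^= static_cast<unsigned char>(p[i]);
        h *= 0x100000001b3ull;
    }
    return static_cast<std::size_t>(h);
}

// Toy map with std::string keys (no duplicate check on insert, for brevity).
template <typename Value>
class TinyMap {
public:
    void insert(std::string key, Value value) {
        const std::size_t h = hash_bytes(key.data(), key.size());
        buckets_[h % kBuckets].push_back(Entry{std::move(key), std::move(value)});
    }

    // Per-call customization: the caller supplies the query's hash and a
    // predicate that compares a stored key against whatever the query is.
    template <typename Eq>
    const Value* find_with(std::size_t hash, Eq eq) const {
        for (const Entry& e : buckets_[hash % kBuckets]) {
            if (eq(e.key)) return &e.value;
        }
        return nullptr;
    }

    // Convenience lookup by a byte slice that is not a std::string.
    const Value* find(const char* p, std::size_t n) const {
        assert(p != nullptr || n == 0);
        return find_with(hash_bytes(p, n), [=](const std::string& k) {
            return k.size() == n && (n == 0 || std::memcmp(k.data(), p, n) == 0);
        });
    }

private:
    struct Entry { std::string key; Value value; };
    static constexpr std::size_t kBuckets = 16;
    std::vector<Entry> buckets_[kBuckets];
};
```

With this shape, looking up the first four bytes of "beta=2" would be m.find(text, 4), and any other query type only has to provide a compatible hash and predicate at the call site.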

pervognsen, 2 days ago (edited 2 days ago)

@chandlerc FWIW, I mostly don't care about non-type-based customization. It's a bit of ceremony to use newtype wrappers. While it's nice to avoid that, it's not usually a big deal. The bigger deal is getting the per-invocation customization right/wrong. E.g. HashMap<K, V>::get() in Rust's std only lets you customize the query key vs stored key comparison indirectly by the Borrow implementation. You can't implement it directly, only by projecting your &Q to a &K via Borrow (maybe impossible).

pervognsen, 2 days ago (edited 2 days ago)

@chandlerc hashbrown's HashMap API gets this more right by introducing its own Equivalent trait, which has a blanket impl based on Borrow so the existing cases that worked with std's HashMap still transparently work here. Meanwhile you can implement Equivalent for a newtype wrapper around your query key to fully customize it, if needed. So it doesn't get rid of the newtype wrapper ceremony, but that's a few lines of code, not a capability or performance issue.

pervognsen, 2 days ago

@chandlerc Which is not to say I wouldn't want to reduce the ceremony if it can be done without cluttering the API in the common case, just that in the hierarchy of things that matter to me personally, it's in the bottom tier along with other things that are mostly just about saving on keystrokes. Whereas some of those other API design mistakes related to customization can be game-over moments if they force me to make expensive temporary objects just to do a key lookup or whatever.

pervognsen, 2 days ago

@chandlerc Gotcha, just wanted to mention that since I've run into different variants of that problem I mentioned in basically every generic hash map library I've used, including in C++. And yeah, it's definitely easier in C++ on the capability side when you can just go crazy with template duck typing, as long as you still make the accommodations in the design of the API.

regehr, 2 days ago to random

jfc is that an attempt at a beard or is he molding

pervognsen, 2 days ago

@regehr Some of us (I'll include myself) are really not meant for beards.

pervognsen, 2 days ago

@regehr Definitely a lesson I learned early on. My dad got the idea that he should try to grow a beard at one point when he was already old (late 60s) and I didn't have the heart to tell him. Fortunately that was a short-lived experiment on his part.

pervognsen, 2 days ago

@regehr And in his defense, it didn't look anywhere near as bad as that photo. There's no excuse for that kind of hair growth except time spent in the wilderness after a plane crash.

pervognsen, 2 days ago

@regehr I've always wondered if "beard lifers" were happy or upset when beards made an overnight comeback in the last 15 years and everyone suddenly had beards. My recollection growing up (I was born in '82) is that almost no adult ever had a beard, especially in business or in most professions.

pervognsen, 2 days ago

@rygorous @regehr I've seen extreme versions of Reverse Hitler where someone could only grow facial hair at the corners of their top lip. You may think you had a bad case of Reverse Hitler but there's always someone out there with an even worse case of Reverse Hitler.

jrouwe, 2 days ago to random

Having fun with the new custom gravity feature for vehicles! #JoltPhysics

video/mp4

pervognsen, 2 days ago

@jrouwe Is the gravity vector pulled from a nearest-neighbor query?

Donzanoid, 2 days ago to random

Are there any state of the art references for fast lexing/parsing on modern machines with SSDs and lots of RAM?

I'm interested in tight inner loops, good BTB use, no data stalls and minimal IO bottlenecks. Is the STB doc still the best reference?

@pervognsen any ideas?

pervognsen, 2 days ago

@Donzanoid It's also worth starting with something even simpler and more traditional. Basically, the classic while (isspace(*s)) s++ kind of skip loop but with a table for the byte classification; also easy to simdify. This is basically going to cost one branch mispredict per "interesting event" if they're sufficiently sparse. If your average source line length is 50 bytes and you only have two "interesting events" per line on average, the mispredict overhead is in the range of <1 cycle/byte.
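
A minimal sketch of that table-driven skip loop, assuming NUL-terminated input. The class flags, ClassTable, and skip_class are hypothetical names chosen for illustration. A real lexer would likely carry more classes.

```cpp
#include <cassert>
#include <cstdint>

// Byte classes as bit flags (a hypothetical, deliberately small set).
enum : std::uint8_t { kSpace = 1, kDigit = 2, kIdent = 4 };

// 256-entry lookup table mapping each byte value to its class flags.
struct ClassTable {
    std::uint8_t cls[256];

    void mark(const char* chars, std::uint8_t flags) {
        for (; *chars; ++chars) cls[static_cast<unsigned char>(*chars)] |= flags;
    }

    ClassTable() : cls() {  // cls() zero-fills, so unlisted bytes have class 0
        mark(" \t\r\n", kSpace);
        mark("0123456789", kDigit | kIdent);
        mark("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_", kIdent);
    }
};

static const ClassTable kTable;

// Advances past every byte whose class shares a bit with `mask`.
// The NUL terminator has class 0, so it always ends the loop.
const char* skip_class(const char* s, std::uint8_t mask) {
    assert(s != nullptr);
    while (kTable.cls[static_cast<unsigned char>(*s)] & mask) ++s;
    return s;
}
```

As a rough check on the estimate in the post: if a mispredict costs on the order of 20 cycles (an assumed figure, not one the post gives), two events per 50-byte line come to 2 × 20 / 50 = 0.8 cycles/byte, which fits the "<1 cycle/byte" range.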

pervognsen, 2 days ago

@Donzanoid I know that's obvious stuff but it's worth considering how far something simple will get you, if only to establish a baseline so you can get an idea of how much of a return on investment there could be for something more complex.

pervognsen, 2 days ago

@Donzanoid Oh, another side note on why you might want to simdify that kind of short skip loop. Once you have it simdified and maybe unrolled a bit, you've essentially clumped/quantized the loop trip count so that the branch predictor will be much happier. This is a pretty standard simd trick for text processing and very low effort.
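
The quantization effect can be sketched without intrinsics. The portable version below is a stand-in for real SIMD, not the author's code: it tests 16 bytes at a time with no early exit, so the outer loop takes one data-dependent branch per block instead of one per byte. The names skip_spaces_blocked and is_space are hypothetical. An actual SIMD version would typically use a vector compare plus a movemask-style operation and finish with a count-trailing-zeros rather than the scalar tail used here.

```cpp
#include <cassert>
#include <cstddef>

// Branch-free whitespace test: bitwise | avoids short-circuit branches.
inline unsigned is_space(unsigned char c) {
    return static_cast<unsigned>((c == ' ') | (c == '\t') | (c == '\n') | (c == '\r'));
}

// Returns the index of the first non-space byte in s[0..n), or n if there is none.
std::size_t skip_spaces_blocked(const char* s, std::size_t n) {
    assert(s != nullptr || n == 0);
    const std::size_t kBlock = 16;
    std::size_t i = 0;

    // Whole blocks: the inner loop has a fixed trip count and no early exit,
    // which gives the compiler a chance to vectorize it.
    while (i + kBlock <= n) {
        unsigned all_space = 1;
        for (std::size_t j = 0; j < kBlock; ++j) {
            all_space &= is_space(static_cast<unsigned char>(s[i + j]));
        }
        if (!all_space) break;  // one data-dependent branch per 16 bytes
        i += kBlock;
    }

    // Scalar tail: pins down the exact position inside the last block.
    while (i < n && is_space(static_cast<unsigned char>(s[i]))) ++i;
    return i;
}
```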

pervognsen, 2 days ago (edited 2 days ago)

@Donzanoid @dominikg I consider the threshold for a pretty fast "normal" lexer to be around 0.5 GB/s, so if you want multiple orders of magnitude beyond that, you might be a bit too aggressive. I think 5 GB/s (1 byte/cycle) is a more realistic target for a skip lexer depending on the specifics of the problem. At that point you're also pretty close to the DRAM bandwidth limit for a symmetric all-core workload, e.g. my laptop has 64 GB/s DRAM bandwidth, with 8 cores that's 8 GB/s/core.

pervognsen, 2 days ago to random

Informative article by @gfxstrand: https://www.collabora.com/news-and-blog/blog/2022/06/09/bridging-the-synchronization-gap-on-linux/

pervognsen, 2 days ago

@jplebreton @gfxstrand Yeah, the new driver got me wanting to know what exactly was the underlying issue.

pervognsen, 2 days ago to random

Downside of Denmark: I'd forgotten all ISPs block scihub, libgen.is, etc. Stuff like this is just a subsidy for all the scammy VPN providers.

pervognsen, 2 days ago

@madmoose Yeah, I found out my browser had been misconfigured (started using a new one recently) and wasn't using my normal DNS settings for some reason. Everything seems fine now.

pervognsen, 4 days ago (edited 4 days ago) to random

I assume this isn't a problem for EEs but for CS types who are taught logic gates, etc, in their curriculum I wonder if timing should be included in a first course. I'm still trying to help the person I mentioned earlier in a private chat and it sounds like that's the source of almost all their confusion. They think logic gates are instant and one of the "counterexamples" they came up with for why delays seem logically inconsistent is y = xor(x, not(x)), which is a standard edge detector.
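
A tiny discrete-time sketch shows why the circuit is not inconsistent once delay is modeled. The assumptions are mine, not the post's: the inverter has a fixed delay of `delay` time steps, the XOR gate is treated as instantaneous, and the input is taken to have been settled at x[0] before t = 0. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simulates y(t) = x(t) XOR NOT x(t - delay) over a sampled 0/1 input waveform.
std::vector<int> xor_with_delayed_not(const std::vector<int>& x, std::size_t delay) {
    std::vector<int> y(x.size());
    for (std::size_t t = 0; t < x.size(); ++t) {
        assert(x[t] == 0 || x[t] == 1);
        // The inverter's output at time t reflects the input from `delay` steps ago.
        const int x_seen_by_not = (t >= delay) ? x[t - delay] : x[0];
        const int not_x = !x_seen_by_not;
        y[t] = x[t] ^ not_x;
    }
    return y;
}
```

With delay 0 the output is constantly 1, matching the "instant gates" intuition. With a nonzero delay the output drops to 0 for `delay` steps after every rising or falling edge of x, which is the edge-detecting pulse.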

pervognsen, 2 days ago (edited 2 days ago)

@abecedarius Yeah, I mean, I figured this out myself eventually. But even when constructivist-style learning is the goal (that's my own preference) you can lay out a more optimal path (e.g. guided projects, exercises, examples, etc, that are intended to maximize opportunities for self-made discoveries and insights) than just letting people get stuck with wrong mental models. And there's only a finite amount of time and letting people figure out everything "the hard way" isn't ideal either.

pervognsen, 2 days ago (edited 2 days ago)

@abecedarius I think programming is a uniquely good tool for stress testing your mental models across many different domains, though. You can hand-wave a lot of stuff to the point where you can fool yourself and others but once you're forced to implement something (e.g. a simulator for asynchronous digital circuits with combinational loops) you have to put it to a much harder test.
