@zeux@mastodon.gamedev.place avatar

zeux

@zeux@mastodon.gamedev.place

Recovering SPU/RSX enthusiast and Vulkan apologist. Previously: technical fellow at Roblox.

pugixml, meshoptimizer, volk, niagara, Luau.


dotstdy, to random
@dotstdy@mastodon.social avatar

subgroup ops are a mindfuck on top of the existing mindfuck of work groups, local work groups, dispatches and invocations.

zeux,
@zeux@mastodon.gamedev.place avatar

@dotstdy What’s the difference between “local work groups” and “work groups”?

zeux,
@zeux@mastodon.gamedev.place avatar

@dotstdy @oblomov This is such a weird terminology given that you can’t observe anything about a “global work group”.

The reasonable hierarchy is: dispatch -> workgroup -> subgroup -> invocation. The word “local” feels superfluous - there’s no need to call any set of “local workgroups” smaller than a dispatch anything as it’s not visible via API.
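As a rough illustration of the dispatch -> workgroup -> subgroup -> invocation hierarchy, here is a minimal sketch of how the IDs relate. The global ID and linear local index formulas follow the GLSL compute built-ins (gl_WorkGroupID, gl_WorkGroupSize, gl_LocalInvocationID, gl_LocalInvocationIndex). The function names are hypothetical, and the linear mapping of invocations onto subgroups is an assumption made for illustration; APIs generally do not guarantee it.

```python
# Sketch of the dispatch -> workgroup -> subgroup -> invocation hierarchy.
# Function names are hypothetical; the formulas mirror GLSL compute built-ins.

def global_invocation_id(work_group_id, work_group_size, local_invocation_id):
    # gl_GlobalInvocationID = gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID
    return tuple(
        g * s + l
        for g, s, l in zip(work_group_id, work_group_size, local_invocation_id)
    )

def local_invocation_index(local_invocation_id, work_group_size):
    # gl_LocalInvocationIndex: linearized position inside one workgroup.
    x, y, z = local_invocation_id
    size_x, size_y, _ = work_group_size
    return z * size_x * size_y + y * size_x + x

def subgroup_of(local_index, subgroup_size):
    # ASSUMPTION: invocations are packed into subgroups in linear order.
    # Returns (subgroup index within the workgroup, lane within the subgroup).
    return local_index // subgroup_size, local_index % subgroup_size

# Workgroup 2 of a dispatch with 64x1x1 workgroups, local invocation 5:
print(global_invocation_id((2, 0, 0), (64, 1, 1), (5, 0, 0)))
print(subgroup_of(local_invocation_index((1, 2, 0), (8, 8, 1)), 32))
```

Nothing here needs a notion of a "global work group": the dispatch supplies the number of workgroups, and every other ID is derived from the workgroup and the invocation's position in it.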

zeux,
@zeux@mastodon.gamedev.place avatar

@dotstdy @oblomov (this hierarchy also usually directly maps to hardware)

zeux,
@zeux@mastodon.gamedev.place avatar

@dotstdy @oblomov I can’t tell from your original text if global workgroup is the same as dispatch or if dispatch is notionally split into an unspecified number of global workgroups… it’s bad. Don’t use it.

GLSL for example agrees with me.
https://registry.khronos.org/OpenGL-Refpages/gl4/html/gl_NumWorkGroups.xhtml

zeux,
@zeux@mastodon.gamedev.place avatar

@dotstdy @oblomov It’s easy to argue with it! But yes this makes it clear that that’s what they meant. I maintain that this is an unnecessary overloading of meaning of work group in the spec.

aras, to Playdate
@aras@mastodon.gamedev.place avatar

Because no one stopped me, I ported "Everybody Wants to Crank the World" demo to PC (Windows/Mac). https://github.com/aras-p/demo-pd-cranktheworld/pull/1 :playdate: :demoscene:

Using Sokol libraries by @floooh to do most of heavy lifting.

Fun fact: while the demo is running, it takes up as much CPU time as the windows task manager on my PC.

zeux,
@zeux@mastodon.gamedev.place avatar

@aras @floooh Nice! Works on iPhone except no audio.

zeux,
@zeux@mastodon.gamedev.place avatar

@floooh @aras Nope, no audio here either.

pervognsen, to random
@pervognsen@mastodon.social avatar

I've never fully worked out how best to articulate my dissatisfaction with the usual way people talk about pluggable allocators in systems programming. Sure, I'd like to have some standard for fallible, pluggable allocation at the lower level of a language's standard library. But the entire mindset of plugging together allocators and data structures is something I find dubious and at best it feels like a poor compromise.

zeux,
@zeux@mastodon.gamedev.place avatar

@pervognsen Recurring conversations with people who defend std::list / std::unordered_map with “but you can fix them if you just pass a fast pool allocator” and no, in fact you can’t.

zeux,
@zeux@mastodon.gamedev.place avatar

@artificialmind @pervognsen Yeah I think that’s similar. One mistake in unordered-map + allocator is that composing independent primitives without pausing to think about the properties of the overall problem will never be great. Here the same effect applies: individually, a good hash map and a good vector are good solutions for specific problems, but you need to reason about the composite case by case.

zeux,
@zeux@mastodon.gamedev.place avatar

@artificialmind @pervognsen Coincidentally I wrote about a similar problem a little while ago https://zeux.io/2023/06/30/efficient-jagged-arrays/
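For readers unfamiliar with the term, here is a minimal sketch of one common flat layout for a jagged array: a single data array plus an offsets array, built with a counting pass and a prefix sum instead of composing a container of containers. This is an illustration only; the function names are hypothetical and the linked post may differ in its details.

```python
# Sketch of a flat jagged array: one offsets array + one data array.
# Names are hypothetical; this is not taken from the linked post.

def build_jagged(pairs, num_rows):
    # Pass 1: count how many values land in each row.
    counts = [0] * num_rows
    for row, _ in pairs:
        counts[row] += 1

    # Prefix sum: offsets[i] is where row i starts, offsets[num_rows] is the total.
    offsets = [0] * (num_rows + 1)
    for i in range(num_rows):
        offsets[i + 1] = offsets[i] + counts[i]

    # Pass 2: scatter values into their rows, preserving input order per row.
    data = [None] * len(pairs)
    cursor = offsets[:num_rows]  # copy: next free slot for each row
    for row, value in pairs:
        data[cursor[row]] = value
        cursor[row] += 1
    return offsets, data

def row_values(offsets, data, i):
    # Row i is the contiguous slice between two consecutive offsets.
    return data[offsets[i]:offsets[i + 1]]

offsets, data = build_jagged([(0, "a"), (2, "b"), (0, "c"), (1, "d")], 3)
print(offsets, data, row_values(offsets, data, 0))
```

The whole structure is two allocations regardless of the number of rows, which is the kind of property you only get by reasoning about the composite rather than plugging independent containers together.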

zeux,
@zeux@mastodon.gamedev.place avatar

@pervognsen @artificialmind All problems were solved in the 1970s or earlier, it’s just that we don’t remember this 50 years later.

zeux,
@zeux@mastodon.gamedev.place avatar

@pervognsen @artificialmind I tried this once and this was not a good tradeoff because of branch mispredictions. In a loop even if you forget the trip count you only get one, and with this I saw significant divergence and visible net slowdown.

zeux,
@zeux@mastodon.gamedev.place avatar

@pervognsen @artificialmind (you could only place tombstones at the end of course, only commenting on order preserving variants)

pervognsen, to random
@pervognsen@mastodon.social avatar

I haven't done any real Vulkan programming since 1.0. Are there any good guides that skip all the legacy junk and only show the streamlined 1.3 way of doing things?

zeux,
@zeux@mastodon.gamedev.place avatar

@pervognsen 1.3 isn’t that different really in terms of things you’re guaranteed to have OTOH. Dynamic rendering and synchronization2, maybe uniform layout. So as stated you aren’t missing much; if you want a drastically different view you need things like bindless (descriptor indexing/buffers), mesh shaders, ray tracing, but all of these are optionally supported everywhere.

A core 1.3 renderer isn’t that drastic a departure from core 1.0, even though you get a bunch of small nice things.

zeux,
@zeux@mastodon.gamedev.place avatar

@pervognsen 1.3 is supported everywhere 1.0 is, assuming latest drivers (in theory). If you don’t target mobile I would definitely go bindless only for anything new, but that has no relation to 1.3.

danluu, to random
@danluu@mastodon.social avatar

I can't quite put my finger on it, but there's something delightful about this list of "legitimate" uses of negative literals:

https://github.com/elm/compiler/issues/1773.

I think part of it is the circumstances that would compel users to construct such a list. Until that thread, it hadn't even occurred to me that someone would present a case against the existence of negative literals that required a rebuttal.

zeux,
@zeux@mastodon.gamedev.place avatar

@NohatCoder @soulthreads @danluu @pervognsen My (hot?) take is that two different ways to declare a const / mutable variable (Rust, Zig, TypeScript, Scala) or insistence on adding const to all locals (modern C++) is simply counterproductive and doesn’t improve code much. Local mutability is just fine actually.

zeux, to random
@zeux@mastodon.gamedev.place avatar

meshoptimizer is at 5000 stars!

I have a growing list of things I'd like to tackle both in the core library and in gltfpack; I intend to spend more time on meshoptimizer/gltfpack in the coming months.

Was hoping to get a grant from Epic MegaGrants for maintenance for this year but they declined my application so I guess we'll just yolo this :)

Thank you everyone for using the project!

https://github.com/zeux/meshoptimizer

pervognsen, to random
@pervognsen@mastodon.social avatar

How much RAM do you have in your dev workstation/laptop?

zeux,
@zeux@mastodon.gamedev.place avatar

@pervognsen 192 GB :) (… and 16GB in the travel laptop)

pervognsen, to random
@pervognsen@mastodon.social avatar

Days like today I look at my food log and realize being a fruitarian wouldn't be half bad.

zeux,
@zeux@mastodon.gamedev.place avatar

@pervognsen @BartWronski For me a small deficit, if your baseline is notably higher than the goal, ends up fairly demoralizing: your sustained progress is less visible amidst natural variation, and it’s easy to mess up the counting and be off by a large fraction of the target deficit. I’ve been trying to do small increments over the last 6 months and it just doesn’t seem to work long term for me, so I'm trying a larger deficit and more control now.

(I’m also slowly increasing strength simultaneously so a narrow balance…)

pervognsen, to random
@pervognsen@mastodon.social avatar

The Brothers Lionheart is isekai.

zeux,
@zeux@mastodon.gamedev.place avatar

@pervognsen Karlsson was huge in USSR and it carried over after the collapse.

zeux,
@zeux@mastodon.gamedev.place avatar

@Doomed_Daniel @pervognsen Pippi was known but much less popular I think. All other books and characters basically not known.

castano, to random
@castano@mastodon.gamedev.place avatar

Let’s say I discover a significant shader optimization for one IHV that’s easy to leverage and people don’t seem to be aware of.

zeux,
@zeux@mastodon.gamedev.place avatar

@castano What does "trade it with the IHV" mean? Trade for what? :)

I resolve questions like this by asking, if I assume there's many people asking the same questions and making the same decision, which world would I rather live in?

zeux,
@zeux@mastodon.gamedev.place avatar

@castano It's unfortunate that IHVs keep information critical to getting good performance on their hardware secret from the ISVs whose goal is literally to make that hardware go faster. But it's entirely possible they are aware of this and a multitude of hidden factors prevent it.

zeux, to random
@zeux@mastodon.gamedev.place avatar

New* post, "target_clones is a trap"! https://zeux.io/2024/04/20/target-clones-trap/ Boosts appreciated :)

This was written in 2022 as I was working on Luau optimizations; 2024 is a great year to repost it as the mechanism behind target_clones (which ended up not being used) was also used in the xz_utils backdoor!
