A fair bit of this article frames things in terms of changes over time, e.g. compute capability has grown and so we no longer need big data. But it seems closer to reality that big data was never required, and continues to not be required. (Looking forward to the same style of post happening in a few years vis-a-vis microservices.)
@zeux@oblomov Well, it's hard to argue with the Vulkan spec on matters of terminology, no matter how ridiculous it is.
> [vkCmdDispatch] When the command is executed, a global workgroup consisting of groupCountX × groupCountY × groupCountZ local workgroups is assembled.
> Decorating a variable with the GlobalInvocationId built-in decoration will make that variable contain the location of the current invocation within the global workgroup.
@dotstdy@oblomov It’s easy to argue with it! But yes, this makes it clear that that’s what they meant. I maintain that this is an unnecessary overloading of the meaning of "workgroup" in the spec.
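For anyone trying to keep the terminology straight, here's a rough sketch of how the quoted spec wording fits together: the "global workgroup" is just the grid of local workgroups, and GlobalInvocationId is the workgroup ID times the local workgroup size plus the local invocation ID. The Python function names are made up for illustration; only the arithmetic comes from the spec.

```python
# Sketch of the index arithmetic behind the spec wording quoted above.
# Function names are hypothetical; the formula is the standard one:
#   GlobalInvocationId = WorkgroupId * WorkgroupSize + LocalInvocationId

def global_invocation_id(workgroup_id, workgroup_size, local_invocation_id):
    """Location of one invocation within the 'global workgroup'."""
    return tuple(
        w * s + l
        for w, s, l in zip(workgroup_id, workgroup_size, local_invocation_id)
    )

def global_workgroup_extent(group_count, workgroup_size):
    """Total invocations per axis for a dispatch of group_count local workgroups."""
    return tuple(g * s for g, s in zip(group_count, workgroup_size))

if __name__ == "__main__":
    # Dispatch of 4x4x1 local workgroups, each 8x8x1 invocations.
    print(global_workgroup_extent((4, 4, 1), (8, 8, 1)))
    # Invocation (3, 5, 0) inside local workgroup (2, 0, 0).
    print(global_invocation_id((2, 0, 0), (8, 8, 1), (3, 5, 0)))
```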
Anyway I'm curious whether I can move the coarse bitmap to LDS, merging the two culling passes. Basically dispatch a "coarse" tile, which loops over all the incoming primitives, writing the bitmap to LDS (16KiB per tile, too much?), then switch to one invocation per "fine" tile, looping over the data in LDS and writing out the final culled tiles.
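A CPU-side sketch of that merged pass, to make the two phases concrete. Lots of assumptions here that aren't in the post: primitives are axis-aligned pixel bounding boxes, the bitmap holds one bit per primitive, and a bytearray stands in for LDS. Under that one-bit-per-primitive assumption, 16KiB would cover 131072 primitives per coarse tile.

```python
# Hypothetical CPU simulation of "coarse bitmap in LDS, then fine tiles".
# Assumed, not from the post: bounding-box primitives, one bit per primitive,
# and a bytearray standing in for LDS.

def overlaps(a, b):
    """Inclusive overlap test for (min_x, min_y, max_x, max_y) boxes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def cull_coarse_tile(coarse, fine_size, prims):
    # Phase 1: the "coarse" tile loops over all incoming primitives and
    # records survivors in the bitmap (this is the part that would live in LDS).
    bitmap = bytearray((len(prims) + 7) // 8)
    for i, prim in enumerate(prims):
        if overlaps(prim, coarse):
            bitmap[i >> 3] |= 1 << (i & 7)

    # Phase 2: one "invocation" per fine tile, reading only the bitmap.
    culled = {}
    for ty in range(coarse[1], coarse[3] + 1, fine_size):
        for tx in range(coarse[0], coarse[2] + 1, fine_size):
            fine = (tx, ty, tx + fine_size - 1, ty + fine_size - 1)
            culled[(tx, ty)] = [
                i for i in range(len(prims))
                if bitmap[i >> 3] & (1 << (i & 7)) and overlaps(prims[i], fine)
            ]
    return culled, len(bitmap)

if __name__ == "__main__":
    prims = [(0, 0, 10, 10), (100, 100, 120, 120), (20, 20, 40, 40)]
    tiles, lds_bytes = cull_coarse_tile((0, 0, 63, 63), 16, prims)
    print(lds_bytes, tiles[(0, 0)], tiles[(16, 16)])
```

On a real GPU the phase boundary would need a workgroup barrier, and the fine pass would iterate set bits rather than every index; this only shows the data flow.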
Has anyone ever made a good system for gameplay queries? (of the "give me all the apples nearby" or "trigger when x happens" style) I'm somewhat of the opinion that any generalized approach here is going to end up in tears, but I'm also of the opinion that gameplay scripting in particular wants to have a pretty generalized approach to help with auditing content (e.g. you want to tightly control what data scripts can access, and when they can access it).
@dotstdy Why would it end in tears? Unless you take an ideologically motivated hard-line approach (i.e. you have to do everything through the query system) it would presumably just be a toolbox just like we already have a toolbox of specific gameplay queries like ray/sphere casts or whatever. Now that I think about it, those do usually end in tears if you care about performance. :)
@pervognsen yeah specifically for performance, and specifically my tears around shipping season. :')
edit: and raycasts also kind of count, as soon as you start driving the logic for individual casts from within a gameplay script it's not going to end well I think. like it's totally fine in the small scale, but absurdly bad for things like ground-snapping. (largely because it makes it very difficult to do sweeping changes like batching and rate limiting across all your content)
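To illustrate the batching and rate limiting point, here's a minimal sketch of what taking individual casts out of the script's hands could look like: scripts only enqueue a request with a callback, and the engine decides how many to run per frame and runs them as one batch. Everything here (RaycastQueue, request, flush, cast_batch) is a hypothetical name, not any particular engine's API.

```python
# Hypothetical deferred raycast queue: scripts request, the engine batches.
from collections import deque

class RaycastQueue:
    def __init__(self, budget_per_frame):
        self.budget = budget_per_frame  # max casts the engine runs per frame
        self.pending = deque()

    def request(self, origin, direction, callback):
        """Called from gameplay scripts; never performs the cast directly."""
        self.pending.append((origin, direction, callback))

    def flush(self, cast_batch):
        """Called once per frame by the engine. cast_batch takes a list of
        (origin, direction) pairs and returns one result per pair."""
        count = min(self.budget, len(self.pending))
        batch = [self.pending.popleft() for _ in range(count)]
        if not batch:
            return 0
        results = cast_batch([(o, d) for o, d, _ in batch])
        for (_, _, callback), result in zip(batch, results):
            callback(result)
        return count

if __name__ == "__main__":
    queue = RaycastQueue(budget_per_frame=2)
    hits = []
    for i in range(5):
        queue.request((i, 10, 0), (0, -1, 0), hits.append)
    # Stand-in batch caster: "hits the ground" at the ray's x coordinate.
    fake_cast = lambda rays: [origin[0] for origin, _ in rays]
    while queue.flush(fake_cast):
        pass
    print(hits)
```

Because the budget and the batch call live in one place, changing them is a one-line engine change rather than an audit of every script.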
I find it funny the way people give away their own lack of understanding sometimes. Reading the comments on this (why do I do this) https://bikepacking.com/plog/man-or-bear-debate/ and there's the classic "[RE: meeting women on the trail] Never in my silly mind crossed the thought of me making them uncomfortable."
If it has never crossed your mind that you might be making somebody uncomfortable - that's a you problem - and specifically one of the core issues the whole damn piece is talking about!
There's a lot of things that pretty trivially give away when somebody hasn't even attempted to rub two brain cells together on a problem, but a man saying "ive never even considered the possibility of making a woman uncomfortable when i meet them alone in the wilderness" is a real banger.
To be honest I'm surprised the "one small rock a day" diet hasn't caught on outside of Sweden. Arguably it's one of the biggest reasons Swedes are all so healthy and attractive. The rest of the world could learn a lot from Europe.
It's always fun when you remove a parallel foreach and do some basic re-organization and end up with a result that's 25% faster on a single core than the old version was spread across 10 cores.
Err, is the GPU supposed to fault if you try to bind a descriptor set to a bind point before you bind a pipeline? I couldn't see any mention of there being an ordering requirement in the spec, but maybe I missed it in the sea of VUIDs.
vkCmdBindPipeline()
vkCmdBindDescriptorSets()
vkCmdDispatch() // totally fine
I feel like the most difficult part of subgroups, and GPU programming in general, is getting all the terminology straight in your head. Sometimes it seems like it would be easier to just write RDNA asm directly. :')
I bet a lot of people think of themselves as important game development influencers, but the simple truth is that in the last decade or so the single most impactful game development thought leader is the "can you pet the dog" twitter account