julienbarnoin,
@julienbarnoin@mastodon.gamedev.place avatar

Somehow I was under the impression that the const keyword for function parameters in GLSL was mostly there for the benefit of programmers, so we'd get an error if we try to modify something we're not supposed to, but that the compiler would figure out on its own whether it gets modified or not.

I was wrong. I have a case where just adding the const keyword to one parameter makes a shader more than twice as fast, from 750µs to 300µs.
Totally unexpected for me, am I the only one?

julienbarnoin,
@julienbarnoin@mastodon.gamedev.place avatar

Now this is also making me doubt another frequent assumption I make. GLSL has an "in" specifier for function parameters too, and I'm basically ignoring it most of the time because it seems to me that it's the same as not specifying anything if the parameter is not an inout. But does specifying "in" actually change anything for compilers compared to not specifying anything?

@gfxstrand do you know?

gfxstrand,
@gfxstrand@mastodon.gamedev.place avatar

@julienbarnoin So that's where I'm not sure. I was honestly a little surprised to see const in GLSL code to begin with. My memory is that the in is the default and inout has to be set manually. I didn't think const was a thing. I'm not sure how it's different from in. I would need to dig through the spec.

julienbarnoin,
@julienbarnoin@mastodon.gamedev.place avatar

@gfxstrand
Not sure about compilers but from the programmer's point of view, the difference between an in and const in parameter would be that modifying the passed in copy of the value within the function is allowed if it's an in but not const, and disallowed if it's const in.

gfxstrand,
@gfxstrand@mastodon.gamedev.place avatar

@julienbarnoin Right, so in that case it's likely the difference between GLSLang inserting an explicit copy (in case the called function overwrites it) vs. just referencing the data.

Depending on the details that can still be tricky to optimize away. In your particular case, in_Instrument contains an array that's then dynamically indexed. That's really hard to copy-propagate in general. I think Mesa can probably do it but it's right on the edge of what I would expect to work.

gfxstrand,
@gfxstrand@mastodon.gamedev.place avatar

@julienbarnoin If the compiler can't propagate the copy, then you're looking at a load of a bunch of data, probably a store to scratch memory, and then a load from scratch memory in the function. The reason for the trip through scratch is the indirect (non-constant) array access.

julienbarnoin,
@julienbarnoin@mastodon.gamedev.place avatar

@gfxstrand Alright, yeah. Well clearly the lesson here is to put const absolutely everywhere possible. I'm going to be going through a looot of shader code where I didn't do this, haha.

gfxstrand,
@gfxstrand@mastodon.gamedev.place avatar

@julienbarnoin Yeah, I think that's certainly a good recommendation for struct arguments. A simple vec4 probably doesn't matter but structs are likely to be big clunky things.

julienbarnoin,
@julienbarnoin@mastodon.gamedev.place avatar

@gfxstrand Yeah it seems like it must be the default, though I have a hard time confirming it. It just really makes me wonder why this keyword exists at all if it's already the default, since it would basically never be needed. That's what's making me doubt.

It's like if they decided to add a new keyword "nonconst" that you can decide to specify for things that are not const, but would be the default if not specified and changes nothing. That would be weird right?

julienbarnoin,
@julienbarnoin@mastodon.gamedev.place avatar

Here's the function for context. In this test it's called 182 times in a row in one compute shader invocation. With const, I have 270µs spent calling this function as measured by clockRealTimeEXT(). Removing this const it spends 720µs in it.

Most of that time doesn't show up if I measure inside the function, only if I measure around the function call: the time goes into copying the BCA_Instrument struct before the call.

This is on Vulkan running on Linux, NVIDIA proprietary drivers.

aeva,
@aeva@mastodon.gamedev.place avatar

@julienbarnoin huh, neat. I've heard people say that declaring vars as const enables compiler optimizations, but I always just figured that was just in situations where the compiler couldn't reasonably infer the constness of a var, and that the primary benefit of const was defensive programming.

Goes to show it's better not to assume what the compiler will do.

gfxstrand,
@gfxstrand@mastodon.gamedev.place avatar

@julienbarnoin GLSL has silly copy-in-copy-out semantics for inout function arguments. Non-const function parameters like that map to something roughly like this in SPIR-V:

BC_I tmp = in_val;
func(&tmp);
in_val = tmp;

For const parameters, GLSLang is able to optimize that to

func(&in_val);

I'm a little surprised that NVIDIA's compiler can't still elide the copy (Mesa's stack is pretty good at this) but if it comes from an SSBO or similar, that can get tricky.
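As a rough C++ analogy for the two lowerings above (not GLSL, and not what any driver actually emits), the sketch below counts how often a struct gets copied when it is passed by value versus by const reference. The `Instrument` struct, its 64-float array, and the function names are hypothetical stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Global counter of copy constructions, for illustration only.
int g_copies = 0;

// Hypothetical stand-in for a big shader struct that contains an array.
struct Instrument {
    float samples[64];
    Instrument() : samples{} {}
    Instrument(const Instrument& other) {
        for (std::size_t i = 0; i < 64; ++i) samples[i] = other.samples[i];
        ++g_copies;
    }
};

// Analogue of a non-const parameter: the callee gets its own writable copy.
float read_by_copy(Instrument inst, std::size_t i) {
    return inst.samples[i % 64];  // dynamic (non-constant) index
}

// Analogue of a const parameter: the callee reads the caller's data in place.
float read_by_ref(const Instrument& inst, std::size_t i) {
    return inst.samples[i % 64];
}

// Calls one variant `calls` times and returns how many struct copies happened.
int count_copies(bool by_copy, int calls) {
    Instrument inst;
    g_copies = 0;
    float sum = 0.0f;
    for (int n = 0; n < calls; ++n) {
        std::size_t i = static_cast<std::size_t>(n);
        sum += by_copy ? read_by_copy(inst, i) : read_by_ref(inst, i);
    }
    (void)sum;  // the result is irrelevant here, only the copy count matters
    return g_copies;
}
```

Calling `count_copies(true, 182)` reports one copy per call, while `count_copies(false, 182)` reports none. Whether a real shader compiler removes such copies depends on the driver, as discussed in this thread.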

gfxstrand,
@gfxstrand@mastodon.gamedev.place avatar

@julienbarnoin Annoyingly, even though you might think those are roughly the same (though pointless in the first case), they have subtly different semantics and we can't properly implement GLSL semantics in GLSLang without doing those copies. 😭

I'm sorry.

Signed,

The person who wrote the code in GLSLang to do the copies. (But NOT the person who gave GLSL those dumb semantics that we're now forced to implement!)

julienbarnoin,
@julienbarnoin@mastodon.gamedev.place avatar

@gfxstrand Thanks for your comment, much appreciated!
Though my initial expectation, as a total non-expert, would be that in the example function I shared here https://mastodon.gamedev.place/@julienbarnoin/112197202516227190 the GLSL to SPIR-V compiler/optimizer would ideally realize that no assignment happens in that function and basically decide to pretend the argument is const.

Of course that's easy to say from a specific example and not the general case, but it seems reasonable in this specific case, right?

gfxstrand,
@gfxstrand@mastodon.gamedev.place avatar

@julienbarnoin Well, that's where it's tricky. The level at which the copy transformation is done to preserve GLSL semantics is in GLSLang. GLSLang isn't really a compiler so much as a transpiler. It does a pretty naive translation from GLSL to SPIR-V. By the time the NVIDIA compiler gets it, the copies are already there and it's a question of whether or not their compiler is smart enough to remove them.

If the copy is in registers, that's pretty easy. If it's in an SSBO, it's much harder.

oblomov,
@oblomov@sociale.network avatar

@gfxstrand @julienbarnoin does this have anything to do with potential aliasing or implicit volatility (in C parlance), so that the load has to be forced because elision would require making assumptions that wouldn't be valid in stream processing context?

gfxstrand,
@gfxstrand@mastodon.gamedev.place avatar

@oblomov @julienbarnoin Yes, I think? I'm having trouble parsing the original question. But it's less because of aliasing and more because GLSL doesn't have pointers/references at all. If you write

void foo(inout int a, inout int b) {
    a = 3;
    b = b + 5;
}

and then you call

int x = 7;
foo(x, x);

With C or C++ reference semantics, that would turn into

x = 3;
x = x + 5;

and x would be 8. In GLSL, it's

a = x;
b = x;
a = 3;
b = b + 5;

// Then, in some order...
x = a;
x = b;

and x would be 3 or 12, depending on which copy-out happens last.
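The difference can be reproduced in plain C++ as a sketch: one function uses real references, the other spells out the copy-in/copy-out steps by hand. The function names and the `a_written_last` flag are made up for illustration; the flag just picks one of the two possible copy-out orders.

```cpp
#include <cassert>

// C++ reference semantics: a and b both alias the caller's variable.
void foo_ref(int& a, int& b) {
    a = 3;
    b = b + 5;
}

int call_with_references() {
    int x = 7;
    foo_ref(x, x);  // x = 3, then x = 3 + 5
    return x;
}

// Hand-written emulation of GLSL copy-in/copy-out for foo(x, x).
int call_with_copy_in_out(bool a_written_last) {
    int x = 7;
    int a = x;      // copy in
    int b = x;      // copy in
    a = 3;          // function body
    b = b + 5;      // function body: 7 + 5
    if (a_written_last) {
        x = b;      // copy out b first
        x = a;      // then a, so a wins
    } else {
        x = a;      // copy out a first
        x = b;      // then b, so b wins
    }
    return x;
}
```

Here `call_with_references()` returns 8, while `call_with_copy_in_out(true)` returns 3 and `call_with_copy_in_out(false)` returns 12.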

oblomov,
@oblomov@sociale.network avatar

@gfxstrand @julienbarnoin oh nice example, very clear (and in some sense this is an aliasing issue, although not via pointers in the traditional C sense). And it is entirely due to what can happen in a single work-item (I'm more used to working with things like OpenCL and CUDA, where work-items can step on each other's toes). Thank you very much.

julienbarnoin,
@julienbarnoin@mastodon.gamedev.place avatar

@gfxstrand To be clear though, I really wasn't looking for anyone to blame, mostly to understand and spread the awareness that this can be a super easy way to make a difference in GLSL shaders. I'm definitely going to be applying liberal amounts of constness in the future.
(Or even better, avoid passing in structs as much as possible, which is what I'll do here)
