Ehhhh the newest big GPU has arrived!... - High-Performance Computing

fclc, 2 months ago

Ehhhh the newest big GPU has arrived!

And you can have two of them connected to Grace!

#B100 is the GPU itself, and #GB200 is the #grace + 2x GPU version

#hpc #GTC

fclc, 2 months ago

30K GPU? That’s a big machine

fclc, 2 months ago

Interested in the coherence claims for #blackwell #HPC

Is it the same “level” as hopper?

fclc, 2 months ago

Oh hey! #mixedprecision! That’s my thing!

What is it Jensen? OCP? FP8? MXfloat? Death to TF32?

fclc, 2 months ago

NO JENSEN NO NO NO

NO FP4 NO FP6, NONE OF THIS NONSENSE.

PLEASE NO

#HPC #GTC

ProjectPhysX, 2 months ago

@fclc FP4 arithmetic on Blackwell... 🖖🤣
Here are all possible values of the glorious FP4 format:

0111 +Inf
0110 +NaN
0101 +NaN
0100 +NaN
0011 +2.0
0010 +1.0
0001 +0.5
0000 +0
1000 -0
1001 -0.5
1010 -1.0
1011 -2.0
1100 -NaN
1101 -NaN
1110 -NaN
1111 -Inf

amonakov, 2 months ago

@ProjectPhysX @fclc @fay59
that's not for real, right? it does not follow IEEE conventions (Inf should come right after the largest finite value, and the 0011 encoding should be 1.5 rather than 2.0), and surely for fp4 you want a two-bit exponent so you have only one NaN of each sign:
0000 0.0
0001 0.5
0010 1.0
0011 1.5
0100 2.0
0101 3.0
0110 +Inf
0111 +NaN
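
The table above can be reproduced mechanically. This is a minimal sketch, assuming the usual IEEE 754-style rules applied to a 1-bit sign, 2-bit exponent, 1-bit mantissa layout (bias 1, subnormals at exponent 0, all-ones exponent reserved for Inf/NaN). The names `decode_minifloat` and `fp4_table` are made up for illustration; nothing here claims to be what Blackwell actually implements.

```python
import math

def decode_minifloat(code, exp_bits=2, man_bits=1):
    """Decode a sign/exponent/mantissa bit pattern using IEEE 754-style rules."""
    sign = -1.0 if (code >> (exp_bits + man_bits)) & 1 else 1.0
    exp = (code >> man_bits) & ((1 << exp_bits) - 1)
    man = code & ((1 << man_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    if exp == (1 << exp_bits) - 1:
        # All-ones exponent: zero mantissa is Inf, anything else is NaN.
        return sign * math.inf if man == 0 else math.nan
    if exp == 0:
        # Zero and subnormals: no implicit leading 1.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    # Normal numbers: implicit leading 1.
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

def fp4_table():
    """Return (bit string, value) pairs for all 16 four-bit encodings."""
    return [(format(c, "04b"), decode_minifloat(c)) for c in range(16)]

for bits, value in fp4_table():
    print(bits, value)
```

With these assumptions the positive half comes out as 0.0, 0.5, 1.0, 1.5, 2.0, 3.0, +Inf, NaN, matching the list above, and there is exactly one NaN encoding per sign.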

steve, 2 months ago

@amonakov @ProjectPhysX @fclc @fay59 none of these strictly follow IEEE 754 (you must have at least two significand bits so you can encode sNaN, and must have at least two exponent bits for reasons that fall out of the constraints on emin and emax), but yeah, this would be a lot closer.
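
To make the sNaN point concrete, here is a small sketch that enumerates the all-ones-exponent encodings for a given mantissa width. It reads "significand bits" as bits of the stored (trailing) significand field, and it assumes the IEEE 754 binary-format convention that the most significant trailing bit is the quiet bit; both are interpretations, and `nan_encodings` is a hypothetical helper.

```python
def nan_encodings(man_bits):
    """Classify mantissa patterns under an all-ones exponent as inf, qnan, or snan."""
    quiet_bit = 1 << (man_bits - 1)  # assumed: MSB of the trailing significand
    kinds = {"inf": [], "qnan": [], "snan": []}
    for frac in range(1 << man_bits):
        if frac == 0:
            kinds["inf"].append(frac)    # zero mantissa encodes infinity
        elif frac & quiet_bit:
            kinds["qnan"].append(frac)   # quiet bit set
        else:
            kinds["snan"].append(frac)   # quiet bit clear, nonzero payload
    return kinds

for m in (1, 2):
    print(m, nan_encodings(m))
```

With one mantissa bit the only NaN pattern has the quiet bit set, so nothing is left over for a signaling NaN; with two bits, the pattern `01` becomes available for it.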

steve, 2 months ago

@amonakov @ProjectPhysX @fclc @fay59 (Give me unsigned FP4, you cowards!)

steve, 2 months ago

@amonakov @ProjectPhysX @fclc @fay59 ((I cannot believe we’re up to like 12 different FP8 formats and not one of them is unsigned, AFAIK))

rygorous, 2 months ago

@steve @amonakov @ProjectPhysX @fclc @fay59 graphics got you covered (not 8b, but 10b/11b) https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#3.1.6%2011-bit%20and%2010-bit%20Floating%20Point

rygorous, 2 months ago

@steve @amonakov @ProjectPhysX @fclc @fay59 (they show up in bitpacked R11G11B10 float)
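
For anyone who has not met these formats, a rough decoding sketch follows. It assumes the layout described in the linked D3D11 spec section: no sign bit, a 5-bit exponent with bias 15, and a 6-bit mantissa for the 11-bit float or a 5-bit mantissa for the 10-bit float, with denormals and Inf/NaN handled the usual way. The packing order (R in the low bits, then G, then B) is also an assumption here, and both function names are made up for illustration.

```python
import math

def decode_unsigned_float(code, man_bits):
    """Decode an unsigned small float with a 5-bit exponent (bias 15)."""
    exp = (code >> man_bits) & 0x1F
    man = code & ((1 << man_bits) - 1)
    if exp == 31:
        # All-ones exponent: zero mantissa is +Inf, anything else is NaN.
        return math.inf if man == 0 else math.nan
    if exp == 0:
        # Zero and denormals: no implicit leading 1.
        return man * 2.0 ** (-14 - man_bits)
    return (1 + man / (1 << man_bits)) * 2.0 ** (exp - 15)

def unpack_r11g11b10(word):
    """Split a 32-bit word into (r, g, b), assuming R occupies the low 11 bits."""
    r = decode_unsigned_float(word & 0x7FF, 6)          # 11-bit float
    g = decode_unsigned_float((word >> 11) & 0x7FF, 6)  # 11-bit float
    b = decode_unsigned_float((word >> 22) & 0x3FF, 5)  # 10-bit float
    return r, g, b

# 1.0 in every channel: exponent field 15, mantissa 0.
white = 0x3C0 | (0x3C0 << 11) | (0x1E0 << 22)
print(unpack_r11g11b10(white))
```

Dropping the sign bit buys one extra mantissa bit per channel, which is the trade steve is asking for in the FP4/FP8 case.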

steve, 2 months ago

@rygorous @amonakov @ProjectPhysX @fclc @fay59 The graphics people of a couple decades back did almost everything the ML people are doing now and did it harder, episode 163/???

rygorous, 2 months ago

@steve @amonakov @ProjectPhysX @fclc @fay59 goddamnit dude, harsh. I was there

I'm not even 40! I shouldn't feel this old!

rygorous, 2 months ago

@steve @amonakov @ProjectPhysX @fclc @fay59 I will say that, as a former PC GPU shader compiler dev, it was interesting talking to mobile GPU compiler devs in the early 2010s. For them, GLSL's FP rules boiled down to "don't be evil" on a coffee-stained napkin, whereas the requirements for our FP environment were quite a bit more nailed down, specifically in the now-public https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#3.1%20Floating%20Point%20Rules
