steve,
@steve@discuss.systems

This implementation of cos(πx) and sin(πx) for Float16 came out pretty neat: https://github.com/stephentyrone/swift-numerics/blob/trig-pi/Sources/_NumericsShims/sincospif16.s

Throughput for this scalar implementation is about 1.75ns/element on M1, which is ... tolerable. I expect that it could go somewhat faster, but this is nice and pretty tidy.

The equivalent pure Swift implementation I have takes about 2.3ns/element, largely because there's no way to say "convert float to integer, and just give me some don't-care value if it's out of range" (which you can't spell in C either). I already put up a patch to add the machinery I need for that in Swift, so we'll get that straightened out and see where we stand.
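To make the conversion issue concrete, here is a rough sketch of the kind of guard that seems to be needed in today's Swift. It is not taken from the linked code or the patch; the name `roundedToIntOrDontCare` and the choice of 0 as the don't-care value are made up for illustration, and it uses `Float` rather than `Float16` so it builds everywhere.

```swift
// Hypothetical helper, not part of swift-numerics or the patch mentioned above.
// Int32(r) traps when r is NaN, infinite, or outside Int32's range, so a plain
// conversion needs a check in front of it. That check is the extra work a
// "don't care" conversion would avoid.
func roundedToIntOrDontCare(_ x: Float) -> Int32 {
    // Round to the nearest integer, ties to even.
    let r = x.rounded(.toNearestOrEven)
    // 0x1p31 is 2^31, exactly representable as a Float. Anything with a
    // magnitude below it converts safely; everything else gets the
    // arbitrary don't-care value (0 here, chosen only for the sketch).
    guard r.isFinite, abs(r) < 0x1p31 else { return 0 }
    return Int32(r)
}
```

For in-range inputs this behaves like an ordinary rounding conversion; for NaN, infinities, and huge values the caller is assumed not to care what comes back.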

krzyzanowskim,
@krzyzanowskim@mastodon.social

@steve did you try using Copilot to get another 55% faster?
