Posts

chandlerc, to random
@chandlerc@hachyderm.io avatar

Playing with the "overflow byte" approach to metadata + SIMD augmented hashtable design.

I've read that this is part of the F14 design, but I found the explanations for Boost's new unordered_flat_map to be what got the point across.

I implemented the core of this in my own way on top of my hashtable -- rather than stealing a byte from the metadata the way Boost does, I just tack on an array of 1-byte-per-group "probe markers" (a term that makes more sense to me than "overflow byte").

chandlerc,
@chandlerc@hachyderm.io avatar

@pervognsen Not whose probe sequence starts, but reaches that group. Which is ... a bit weirder. Especially with quadratic probing.

The bits are only ever set once the group is full, but the theory, I think, is that despite the group being full, 7/8ths of probes should terminate when hitting it.

But that doesn't seem to really materialize for my implementation. Now I'm wondering if I've got bugs or something, and this is just a weirder way of tracking "was there an empty metadata byte".
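
A minimal sketch of the probe-marker idea as described above, not the actual implementation from these posts. Everything here is an assumption made for illustration: the `ToyProbeMarkerSet` name, the multiplicative hash, the fixed capacity, and the plain per-group key scan standing in for the metadata bytes and SIMD/SWAR matching. Each group gets one marker byte; a bit is set only when an inserted key finds the group full and probes past it, and a lookup stops at a group unless the bit selected by its hash is set.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy set: groups of 8 slots plus one "probe marker" byte per
// group. Fixed capacity, no deletion, no growth, no metadata bytes.
class ToyProbeMarkerSet {
 public:
  // `num_groups` must be a power of two so triangular probing visits every group.
  explicit ToyProbeMarkerSet(std::size_t num_groups)
      : groups_(num_groups), markers_(num_groups, 0) {
    assert(num_groups != 0 && (num_groups & (num_groups - 1)) == 0);
  }

  // Returns false only if every group is full.
  bool Insert(std::uint32_t key) {
    if (Contains(key)) return true;
    const std::uint64_t h = Hash(key);
    std::size_t g = StartGroup(h);
    for (std::size_t step = 1; step <= groups_.size(); ++step) {
      Group& group = groups_[g];
      if (group.size < 8) {
        group.keys[group.size++] = key;
        return true;
      }
      // The group is full and this key probes past it: record that in the
      // marker bit chosen by the key's hash.
      markers_[g] |= MarkerBit(h);
      g = (g + step) & (groups_.size() - 1);  // triangular (quadratic) probing
    }
    return false;
  }

  bool Contains(std::uint32_t key) const {
    const std::uint64_t h = Hash(key);
    std::size_t g = StartGroup(h);
    for (std::size_t step = 1; step <= groups_.size(); ++step) {
      const Group& group = groups_[g];
      for (std::size_t i = 0; i < group.size; ++i) {
        if (group.keys[i] == key) return true;
      }
      // Stop unless some key with the same marker bit probed past this group.
      if ((markers_[g] & MarkerBit(h)) == 0) return false;
      g = (g + step) & (groups_.size() - 1);
    }
    return false;
  }

 private:
  struct Group {
    std::array<std::uint32_t, 8> keys{};
    std::uint8_t size = 0;
  };

  // Assumed hash: a simple multiplicative mix, chosen only for brevity.
  static std::uint64_t Hash(std::uint32_t key) {
    return (key + 1ull) * 0x9E3779B97F4A7C15ull;
  }
  std::size_t StartGroup(std::uint64_t h) const {
    return static_cast<std::size_t>(h >> 32) & (groups_.size() - 1);
  }
  // The top three hash bits pick one of the 8 marker bits.
  static std::uint8_t MarkerBit(std::uint64_t h) {
    return static_cast<std::uint8_t>(1u << (h >> 61));
  }

  std::vector<Group> groups_;
  std::vector<std::uint8_t> markers_;
};
```

In this sketch a lookup that reaches a full group continues only when its own marker bit is set, which is where the "7/8ths of probes should terminate" intuition comes from when a single bit of the marker has been set.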

chandlerc,
@chandlerc@hachyderm.io avatar

Interestingly, on Arm where I use a group size of 8 (and SWAR instead of SIMD as it's faster), I'm seeing the probe marker result in significantly more probing. More keys probe, and probe further (likely as a consequence). But only for some key data types. Weird. But not encouraging for this technique despite it being conceptually simpler.

chandlerc, (edited ) to random
@chandlerc@hachyderm.io avatar

Noooo... cppreference is down, how will I write code?

EDIT: back up now for me it seems....

DanielaKEngert,
@DanielaKEngert@hachyderm.io avatar

@chandlerc Is it?
At least, it's not slow here and not slow now.

chandlerc,
@chandlerc@hachyderm.io avatar

@DanielaKEngert Back to normal now for me too. 🤷🏻

chandlerc, to random
@chandlerc@hachyderm.io avatar

It's not an accident that the image of the guy with the "change my mind" poster is a racist, homophobic, abuser. And most of the memes using that template were horrible. It was never an invitation to change his mind.

P.S.: Don't use that meme template. Was horrified when I learned the full story. And no, don't ask me to cite my sources, use a search engine.

https://hachyderm.io/@mekkaokereke/112515952618311049

chandlerc, to random
@chandlerc@hachyderm.io avatar

C++ data structure API design question...

What are folks' favorite ways to design a data structure that supports users providing two closely coupled custom functions? Why that pattern?

Specifically, imagine a hash table data structure that wants to allow users to deeply customize both the hash function and the equality comparison.

Current ideas, w/o ranking or even saying I like them, and interested in others:

  • A type parameter with static functions
  • two lambda template parameters
  • CRTP
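
To make the first option concrete, here is a hedged sketch of "a type parameter with static functions". The names `CaseInsensitiveKey`, `ToySet`, `Hash`, and `Equal` are hypothetical, and the chained-bucket table is only a stand-in so the example stays short; it is not a recommendation among the options listed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical policy type: one type bundles the two coupled functions, so
// the hash and the equality it must agree with cannot be supplied separately.
struct CaseInsensitiveKey {
  static char Lower(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
  }
  // FNV-1a over the lower-cased bytes.
  static std::uint64_t Hash(const std::string& s) {
    std::uint64_t h = 14695981039346656037ull;
    for (char c : s) {
      h ^= static_cast<unsigned char>(Lower(c));
      h *= 1099511628211ull;
    }
    return h;
  }
  static bool Equal(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
      if (Lower(a[i]) != Lower(b[i])) return false;
    }
    return true;
  }
};

// Toy chained set with a fixed bucket count; it only calls Policy::Hash and
// Policy::Equal, never std::hash or operator==.
template <typename KeyT, typename Policy>
class ToySet {
 public:
  ToySet() : buckets_(16) {}

  // Returns true if the key was newly inserted.
  bool Insert(const KeyT& key) {
    std::vector<KeyT>& bucket = buckets_[Policy::Hash(key) % buckets_.size()];
    for (const KeyT& existing : bucket) {
      if (Policy::Equal(existing, key)) return false;
    }
    bucket.push_back(key);
    return true;
  }

  bool Contains(const KeyT& key) const {
    const std::vector<KeyT>& bucket =
        buckets_[Policy::Hash(key) % buckets_.size()];
    for (const KeyT& existing : bucket) {
      if (Policy::Equal(existing, key)) return true;
    }
    return false;
  }

 private:
  std::vector<std::vector<KeyT>> buckets_;
};
```

With `ToySet<std::string, CaseInsensitiveKey>`, inserting "Hello" and then "HELLO" stores a single element, because both functions come from the same policy type and agree on case folding.
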

pervognsen,
@pervognsen@mastodon.social avatar

@zwarich @chandlerc (BTW, I think that storage proposal would work at least as well in C++ as in Rust and I believe Matthieu said he had first used the idea in the C++ code base at his day job.)

chandlerc,
@chandlerc@hachyderm.io avatar

@zwarich @pervognsen SOOOO much yes on implicit parameters.

We'll get there with Carbon. ;]

chandlerc, to llvm
@chandlerc@hachyderm.io avatar
chandlerc, to random
@chandlerc@hachyderm.io avatar

Saw this again, and it remains excellent.

noahf,
@noahf@hachyderm.io avatar

@chandlerc
I always wondered if geese ever experience gander dysphoria.

chandlerc, to random
@chandlerc@hachyderm.io avatar

Northern lights in the SF bay area (up on our mountain), wild.

Thanks to @hanadusikova for having the right phone camera to get the great photo.

chandlerc,
@chandlerc@hachyderm.io avatar

More wild in some ways is one of our nest cameras looking out over Half Moon Bay. It ... is sparkling. Like, you can see a few bright spots in this screen capture, but it's constantly sparkling throughout the dark regions (need to turn up brightness to see).

I don't have any explanation for this other than the CME... We've zoomed in like this to look at the lights from Half Moon Bay glowing in the night before, no sparkles. So weird.

chandlerc,
@chandlerc@hachyderm.io avatar

"Free dental X-rays" says @hanadusikova

chandlerc, to random
@chandlerc@hachyderm.io avatar

Second edition of the Carbon Copy is out!

https://github.com/carbon-language/carbon-lang/discussions/3869

This one discusses Carbon's "unformed state" as a way to address safety risks of uninitialized data while maximizing the control over exactly which tradeoff should be taken, and maximizing the ability to diagnose or mitigate bugs.

chandlerc,
@chandlerc@hachyderm.io avatar

And also just arrived in Vienna, Austria for EuroLLVM! Looking forward to seeing folks after way too long and catching up. Can even pester us with questions about unformed state on the Carbon panel!

chandlerc, to random
@chandlerc@hachyderm.io avatar

Needed a term for the unsustainable, irrational, and often quite bumbling and foolish excitement when you're first fully healthy after being really sick for a few days...

ZOOMIES!!! YES I HAVE POST-SICK ZOOMIES!!!!

Now to go contain myself and not slide right back to being sick....

chandlerc, to random
@chandlerc@hachyderm.io avatar

I continue to think this is one of the biggest insights I have had professionally:

Communicating "up" to "senior leadership"[1] is almost entirely about iterative synthesis and refinement of a reasonable abstraction for them to use to understand and communicate about what you're actually doing/proposing/asking/etc.

[1]: In whatever form this takes. But specifically folks as far away as you might call "executives" in a business context or of similar breadth/scale in another org context.

chandlerc,
@chandlerc@hachyderm.io avatar

Like, yes, you need to ask for a thing, or propose a thing, or ....

But 90% of the effort is causing there to be a suitable abstraction at the relevant leadership level for them to understand, make any decision, and communicate w/ peers or their leadership about it.

And mostly, finding that abstraction I think is what is the most difficult, impactful, etc.

chandlerc, to random
@chandlerc@hachyderm.io avatar

A random and weird AArch64 performance question I'm mulling...

Which instruction sequence in a linear dep chain is better?
a)

lsr ... #7  
rbit ...  
clz ...  

b)

rev ...  
clz ...  

It seems like (b) should be clearly superior.... unless some uarch does fusion of rbit and clz to a single uop (ctz-esque). If there is fusion (a) could easily be better...

Anyone know of such a uarch? Worth avoiding (b) in case of a future one?

(I'm working on getting measurements for the M1...)
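
A hedged sketch of why the two sequences can compute the same thing. The assumption here, which the post does not state, is that the input is a SWAR match mask with at most the high bit (0x80) of each byte set, and that the goal is the index of the lowest matching byte. `Clz64`, `Rbit64`, and `Rev64` are hypothetical portable stand-ins for the `clz`, `rbit`, and `rev` instructions, written as loops for clarity rather than speed.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for `clz`: count of leading zero bits (64 when x is zero).
inline int Clz64(std::uint64_t x) {
  int n = 0;
  for (std::uint64_t bit = 1ull << 63; bit != 0 && (x & bit) == 0; bit >>= 1) {
    ++n;
  }
  return n;
}

// Stand-in for `rbit`: reverse the order of all 64 bits.
inline std::uint64_t Rbit64(std::uint64_t x) {
  std::uint64_t r = 0;
  for (int i = 0; i < 64; ++i) {
    r = (r << 1) | ((x >> i) & 1);
  }
  return r;
}

// Stand-in for `rev`: reverse the order of the 8 bytes.
inline std::uint64_t Rev64(std::uint64_t x) {
  std::uint64_t r = 0;
  for (int i = 0; i < 8; ++i) {
    r = (r << 8) | ((x >> (8 * i)) & 0xFF);
  }
  return r;
}

// Sequence (a): lsr #7, rbit, clz. The shift moves each 0x80 bit to the
// bottom of its byte, so bit reversal puts it at the top of the mirrored byte.
inline int FirstMatchA(std::uint64_t mask) {
  assert(mask != 0 && (mask & ~0x8080808080808080ull) == 0);
  return Clz64(Rbit64(mask >> 7)) / 8;
}

// Sequence (b): rev, clz. Byte reversal leaves the 0x80 bit at the top of its
// byte, so no shift is needed before counting leading zeros.
inline int FirstMatchB(std::uint64_t mask) {
  assert(mask != 0 && (mask & ~0x8080808080808080ull) == 0);
  return Clz64(Rev64(mask)) / 8;
}
```

Under that assumption both functions return the same byte index for every valid mask; the only difference is whether the set bit has to be moved within its byte first.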

dougall,
@dougall@mastodon.social avatar

@chandlerc I'd stick with (b) – if there were fusion, then they'd be equal on uops, but (a) would be worse on instruction count and code size, so still worse in general. Latency should be 1c for each of these instructions just about everywhere, though there are a lot of different CPUs, so feel free to double check that.

I also don't believe such fusion exists – I spent time looking through various manuals to try to find patterns to test for on M1, and I don't recall anything like that.

chandlerc,
@chandlerc@hachyderm.io avatar

@dougall Thanks!

This is also the direction I liked due to code size... But I also couldn't help wondering if I'm just too excited about the fun of using byte-reverse instead of bit-reverse to avoid moving the set bit around. ;] Good to know that you've not seen anything crop up that would point away from it or towards fusion of this case.

chandlerc, to random
@chandlerc@hachyderm.io avatar

For C++ library folks -- should const on containers propagate to the elements in the container? why? (or why not?)

And if "yes", why should span (or equivalent) not take the same path? Or should it?

(To be clear, I have lots of my own thoughts on all of these questions. I'm not asking because I'm unaware of any possible answers, but to see how others think about them.)
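
For readers less familiar with the distinction being asked about, a small sketch of the two behaviors. `ToySpan` and `BumpThroughConstView` are hypothetical names; `ToySpan` is a minimal stand-in for a span-like view (C++20 `std::span` has the same shallow-const element access), used here so the example does not require C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical minimal view: it points at elements it does not own, and a
// const view still hands out mutable references (shallow const).
template <typename T>
struct ToySpan {
  T* data;
  std::size_t size;
  T& operator[](std::size_t i) const { return data[i]; }
};

// A const std::vector propagates const to its elements...
static_assert(
    std::is_same<decltype(std::declval<const std::vector<int>&>()[0]),
                 const int&>::value,
    "const vector yields const elements");

// ...while a const view does not.
static_assert(
    std::is_same<decltype(std::declval<const ToySpan<int>&>()[0]),
                 int&>::value,
    "const view yields mutable elements");

// Compiles even though the view is taken by const reference; the same body
// with `const std::vector<int>&` would be rejected by the compiler.
inline int BumpThroughConstView(const ToySpan<int>& view) {
  view[0] += 1;
  return view[0];
}
```

The sketch only shows what each choice permits; it takes no position on which behavior a container or view should have.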

chandlerc,
@chandlerc@hachyderm.io avatar

@Paxxi @resistor He mentioned that to an extent, it's an analogous case of propagation of const making it harder to use.

But all I was clarifying is whether the requirement to reach for tools like mutable would be a negative aspect. Some folks see it as a code smell, others don't.

Paxxi,
@Paxxi@hachyderm.io avatar

@chandlerc @resistor ahh right.
I'm fighting myself on this one. For e.g. String I'm thinking const should mean const but for collections I'm thinking const should not propagate to elements.

I don't have any real arguments either way, just my feelings 😀

chandlerc, to random
@chandlerc@hachyderm.io avatar

Happy new years everyone!!!

I'm super excited for at least the tech things in 2024.

Also, stay safe friends in Japan!

chandlerc, to llvm
@chandlerc@hachyderm.io avatar

Is there a good reason targeting doesn't seem to fold shifts into operands when it would require shifting in multiple operands?

I'm seeing lots of:

lsr xN, xN, #7  
and x?, x?, xN  
...  
and x?, x?, xN  

With no other uses of xN.

Is there a reason to prefer this over:

and x?, x?, xN, lsr #7  
...  
and x?, x?, xN, lsr #7  

While "duplicated", it seems like it would save an instruction at least in decode?

TomF,
@TomF@mastodon.gamedev.place avatar

@steve @chandlerc Haha - "lea" is such an ugly weird little instruction, but it turns out it's so annoyingly useful it sneaks into every arch :-)

chandlerc,
@chandlerc@hachyderm.io avatar

@TomF @steve I mean, I'm somewhat aware of the diversity of uarch's out there.... And I don't really want more knobs in the compiler. I hate them.

But I'm specifically saying that thresholds where encoding A vs. encoding B results in 2 vs. 1 uop seem very important to document and teach compilers about. Not every other difference. =D Nicer to not have them at all, but if they exist, we need to know? And this doesn't seem like a terribly frustrating threshold to model.

chandlerc, to random
@chandlerc@hachyderm.io avatar

I'm really exhausted of complaining about the Bazel project's auto-close response that has instructions no one outside the Bazel team can follow for keeping actual issues that are impacting users open.

Has anyone played with Buck2? Any example C++ projects using it that I could look at?

BoredomFestival,
@BoredomFestival@sfba.social avatar

@chandlerc If the Bazel team won't listen to your complaints about C++ issues, I guess we are doomed to continue with CMake forever :⁠-⁠\
