@chandlerc@hachyderm.io avatar

chandlerc

@chandlerc@hachyderm.io

Software, performance, optimization, programming languages, security, open source, #CarbonLang lead, #LLVM, #Clang, C++. 🏳️‍🌈 http://pronoun.is/he or http://pronoun.is/they

This profile is from a federated server and may be incomplete. Browse more on the original instance.

chandlerc, to random
@chandlerc@hachyderm.io avatar

Playing with "overflow byte" approach to metadata+simd augmented hashtable design.

I've read that this is part of the F14 design, but I found the explanations for Boost's new unordered_flag_map to be what got the point across.

I implemented the core of this in my own way on top of my hashtable -- rather than stealing a byte from the metadata the way Boost does, I just tack on an array of 1-byte-per-group "probe markers" (a term that makes more sense to me than "overflow byte".

chandlerc,
@chandlerc@hachyderm.io avatar

Initially, I just wanted to confirm the statistical benefits that are espoused for this approach.

Because I'm really quite bad at math, statistics, and modelling, the easiest way for me was to implement it and some detailed metrics, and then run my benchmarks and see what happens to average probe length.

The big idea is using one bit of the probe marker (or "overflow byte") to track whether a particular key needs to keep probing past the current group. How does this reduce probe length?

chandlerc,
@chandlerc@hachyderm.io avatar

My understanding (please lemme know if I've misunderstood):

You use more bits from the hash to select the bit of the byte for a particular key. This basically lets you probabilistically end the probe early.

Compared to the tombstone approach of SwissTable, where the only end to a probe sequence is a group with an empty slot. All the keys share this terminating condition, but with the 8 bits, you should only share the termination with 1/8th of keys?

And so it should terminate (much) sooner?

chandlerc,
@chandlerc@hachyderm.io avatar

This does seem to work. All my fancy testing passes.

But the probe distance improvements? Tiny. Like, miniscule. Way less than I would have hoped.

Or is it? sometimes I see 30% drop in average distance. But the average goes from 0.072 to 0.051...

Even the max distance reductions are very moderate.

I don't really see large probe distances. With the most probe-prone data set (identifier like strings), average across all keys is at worst 0.10.

chandlerc,
@chandlerc@hachyderm.io avatar

And the measured timings don't seem to improve.

I've not yet really looked at how optimal the generated code is yet because I was mostly hoping to see an improvement in the computed average probe distances first...

Maybe I will though. It is in some ways nicer code, simpler than searching for empty and tombstone metadata. And the added memory is tiny. But a little perplexed at the results.

chandlerc,
@chandlerc@hachyderm.io avatar

@pervognsen Not whose probe sequence starts, but reaches that group. Which is ... a bit weirder. Especially with quadratic probing.

The bits are only ever set once the group is full, but despite being full 7/8th of probes should terminate when hitting it is I think the theory.

But that doesn't seem to really materialize for my implementation. Now I'm wondering if I've got bugs or something, and this is just a weirder way of tracking "was there an empty metadata byte".

chandlerc,
@chandlerc@hachyderm.io avatar

Intersetingly, on Arm where I use a group size of 8 (and SWAR instead of SIMD as its faster), I'm seeing the probe marker result in significant more probing. More keys probe, and probe further (likely as a consequence). But only for some keys data types. Weird. But not encouraging for this technique despite it being conceptually simpler.

steve, to TVTooHigh
@steve@discuss.systems avatar

TFW but you left multiple feet of wall above it anyway. Aim higher! Do better!

chandlerc,
@chandlerc@hachyderm.io avatar

@steve See, I feel like people hav ethe TV-too-high thing all wrong.

Y'all seem to be watching TV sitting upright like it's time to play the piano or something. You need to be laid back watching TV. The call it a recliner because it RECLINES people.

Then the TV's gonna be at a fine height, just tilt it down a bit. I'd rather a bit higher TV and not have to be looking down my nose all the damn time.

🤡

chandlerc, (edited ) to random
@chandlerc@hachyderm.io avatar

Noooo... cppreference is down, how will I write code?

EDIT: back up now for me it seems....

chandlerc,
@chandlerc@hachyderm.io avatar

@jawnsy my memory is too poor to even write bad code without a reference open next to me...

chandlerc,
@chandlerc@hachyderm.io avatar

Maybe not down, just super slow? Same DDOS as hitting archive.org?

chandlerc,
@chandlerc@hachyderm.io avatar

@alexr Yeeeaaaaah....

Also, definitely writing code for fun and enjoyment today. =D

chandlerc,
@chandlerc@hachyderm.io avatar

@DanielaKEngert Back to normal now for me too. 🤷🏻

chandlerc, to random
@chandlerc@hachyderm.io avatar

It's not an accident that the image of the guy with the "change my mind" poster is a racist, homophobic, abuser. And most of the memes using that template were horrible. It was never an invitation to change his mind.

P.S.: Don't use that meme template. Was horrified when I learned the full story. And no, don't ask me to cite my sources, use a search engine.

https://hachyderm.io/@mekkaokereke/112515952618311049

resistor, to random
@resistor@mastodon.online avatar

Quick and dirty encoding of Altivec into RISC-V: https://gist.github.com/resistor/bc82bd259b662084a5950ac9b863d8c2

chandlerc,
@chandlerc@hachyderm.io avatar

@resistor My impression has always been that SVE was actually great from an encoding perspective -- would it fit?

chandlerc,
@chandlerc@hachyderm.io avatar

@resistor :blobfoxcrylaugh:

shauna, to animals
@shauna@social.coop avatar

Brushing my teeth in the kitchen sink because my cat is sleeping in the bathroom one

chandlerc,
@chandlerc@hachyderm.io avatar

@shauna This is the way.

chandlerc, to random
@chandlerc@hachyderm.io avatar

C++ data structure API design question...

What are folks favorite ways to design a data structure that supports users providing two closely coupled custom functions? Why that pattern?

Specifically, imagine a hash table data structure that wants to allow users to deeply customize both the hash function and the equality comparison.

Current ideas, w/o ranking or even saying I like them, and interested in others:

  • A type parameter with static functions
  • two lambda template parameters
  • CRTP
chandlerc,
@chandlerc@hachyderm.io avatar

While kinda minimal, the lambda approach gives me lots of pause about ergonomics if inline in the template arg list isn't a good place to write the customization. Also awkward to have N>1 of these, or do any additional customization.

Currently leaning towards CRTP because users will very often want to give the table type a meaningful local type def & name anyways. Can avoid needing to also come up with a name for a type param.

chandlerc,
@chandlerc@hachyderm.io avatar

@pervognsen yeah, I alryhave all the heterogenous side of this. (It's easier in C++)

But the context I'm looking at next is really awkward to do even with new type stuff....

I want to pass in context that is used both by the lookup value and the stored values for all operations. Including growth where the lookup value (and thus the new type) has no involvement at all.

chandlerc,
@chandlerc@hachyderm.io avatar

@swym Hmm... with inheriting constructors, most of that should happen automatically... Unsure about assignment, but I feel like it could be not-too-bad....

chandlerc,
@chandlerc@hachyderm.io avatar

@foonathan On one hand, yes to everything you say....

But on the other hand, it is quite nice that this gives a clean and unsurprising path to inject state if useful....

I guess, my question is -- what goes wrong with CRTP?

chandlerc,
@chandlerc@hachyderm.io avatar

@foonathan Storing the policy isn't enough -- you need to figure out how to initialize it in constructors etc. And those can be tricky to build.

Recently, I've found the boilerplate of inheriting constructors much (much) more minor than it has been in the past... But maybe I'm forgetting something. I may try the CRTP approach and see how it goes in particular because the fully expected use case will end up with a typedef for the type, and needing to pick two tightly coupled names otherwise...

chandlerc,
@chandlerc@hachyderm.io avatar

@foonathan both of those costs though are in the infra where I don't mind. Only really worried about the boilerplate in the users.

chandlerc,
@chandlerc@hachyderm.io avatar

@zwarich @pervognsen SOOOO much yes on implicit parameters.

We'll get there with Carbon. ;]

  • All
  • Subscribed
  • Moderated
  • Favorites
  • megavids
  • thenastyranch
  • rosin
  • GTA5RPClips
  • osvaldo12
  • love
  • Youngstown
  • slotface
  • khanakhh
  • everett
  • kavyap
  • mdbf
  • DreamBathrooms
  • ngwrru68w68
  • provamag3
  • magazineikmin
  • InstantRegret
  • normalnudes
  • tacticalgear
  • cubers
  • ethstaker
  • modclub
  • cisconetworking
  • Durango
  • anitta
  • Leos
  • tester
  • JUstTest
  • All magazines