
alcinnz, to random
@alcinnz@floss.social avatar

In our hypothetical string-centric hardware basically all the unpredictable memory accesses fall to the Parsing Unit's instruction prefetch/predecode. Yet there's only so much it'd be able to handle at a time, hence we'd periodically need to optimize the Parsing Unit's code to fit in a rigid structure we can hope to prefetch/predecode!

We'd implement a smaller-scale firmware version triggered upon inserting new rules into existing parsers, & a larger-scale version as part of the compiler.

1/6?

alcinnz,
@alcinnz@floss.social avatar

We'd require a peephole-optimization & state-renaming pass to ensure the other optimizations don't make things worse.

An interesting technique we could use is to rewrite the subgraph we're concerned with to label all edges identically, so we can list every state reachable within n steps! Upon this we could gather a bitmask (in the Arithmetic Core) or trie (in the Parsing Unit, built on the firmware) of every reachable state, & in turn build a GC discarding unreachable states!
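A minimal sketch of that reachability bitmask & the GC built on it, assuming states are numbered from 0 & each state's successors are already stored as a bitmask (labels ignored). The names reachable_within & collect_garbage are hypothetical, & this is ordinary software, not the firmware itself.

```python
def reachable_within(edges, start, n):
    """Return a bitmask of the states reachable from `start` in at most n steps.

    `edges[s]` is a bitmask of the successors of state s; edge labels are
    ignored, which is the same as labelling every edge identically.
    """
    seen = 1 << start
    frontier = seen
    for _ in range(n):
        nxt = 0
        state = 0
        remaining = frontier
        while remaining:
            if remaining & 1:
                nxt |= edges[state]
            remaining >>= 1
            state += 1
        frontier = nxt & ~seen  # keep only newly discovered states
        if not frontier:
            break
        seen |= frontier
    return seen


def collect_garbage(edges, start):
    """Discard unreachable states & renumber the survivors compactly."""
    count = len(edges)
    live = reachable_within(edges, start, count)
    rename = {}
    for s in range(count):
        if live >> s & 1:
            rename[s] = len(rename)
    compacted = []
    for s in range(count):
        if s not in rename:
            continue
        mask = 0
        for t in range(count):
            if edges[s] >> t & 1:
                mask |= 1 << rename[t]  # successors of live states are live
        compacted.append(mask)
    return compacted, rename
```

The rename table returned here doubles as input for the state-renaming pass mentioned earlier.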

If that's not enough...

2/6?

alcinnz,
@alcinnz@floss.social avatar

A similar technique could gather all states reachable along the same label (& maybe several optional "eta" edges) from the same starting states. Rearrange the output into a renaming table, & we could turn our "NFA" into a "DFA". Then again our hardware handles NFAs fine...
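In software this is the classic subset construction. A rough sketch, assuming the NFA is a dict mapping each state to a list of (label, target) pairs, with the label None standing in for the optional label-free edges; the function name determinize & that representation are illustrative assumptions.

```python
from collections import deque


def determinize(nfa, start, free=None):
    """Subset construction: returns (dfa, rename).

    `rename` maps each set of NFA states to its new DFA state number,
    i.e. the renaming table; `dfa[n]` maps labels to DFA state numbers.
    """
    def closure(states):
        # Follow label-free edges until no new states turn up.
        out = set(states)
        stack = list(states)
        while stack:
            s = stack.pop()
            for label, target in nfa.get(s, ()):
                if label is free and target not in out:
                    out.add(target)
                    stack.append(target)
        return frozenset(out)

    start_set = closure({start})
    rename = {start_set: 0}
    dfa = {}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        # Gather every state reachable along the same label.
        by_label = {}
        for s in current:
            for label, target in nfa.get(s, ()):
                if label is not free:
                    by_label.setdefault(label, set()).add(target)
        row = {}
        for label in sorted(by_label):
            target_set = closure(by_label[label])
            if target_set not in rename:
                rename[target_set] = len(rename)
                queue.append(target_set)
            row[label] = rename[target_set]
        dfa[rename[current]] = row
    return dfa, rename
```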

However if we convert our inverse-NFA into a DFA & uninvert it that'll remove plenty of redundancy & tidy up trailing branches! Inverting the graph may be a 2-step process populating a mapping before reformatting it.
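The 2-step inversion could look something like the sketch below, assuming the graph is a list indexed by state number holding (label, target) pairs; invert is a hypothetical name. Inverting, determinizing, & uninverting closely resembles Brzozowski's minimization algorithm, which additionally determinizes once more at the end.

```python
def invert(graph, n_states):
    """Reverse every edge of `graph`.

    `graph[s]` is a list of (label, target) pairs for state s.
    """
    # Step 1: populate a mapping keyed by each edge's target.
    incoming = {}
    for source in range(n_states):
        for label, target in graph[source]:
            incoming.setdefault(target, []).append((label, source))
    # Step 2: reformat the mapping back into the list-per-state layout.
    return [sorted(incoming.get(s, [])) for s in range(n_states)]
```

Inverting twice gives back the original edges (sorted per state), which makes for a cheap sanity check.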

3/6?

alcinnz,
@alcinnz@floss.social avatar

If we thus fail to shrink the code, next we'd turn our attention to finding the best place to split it! Ideally we'd split it into "connected components" so once we transition out of a subgraph we won't return to it.

An algorithm which suits this hardware would remove an arbitrary edge & examine which states are still reachable from each state it connected. If a state's reachable from both we'd ideally keep that subgraph together, recursing to split up the other 0-3 subgraphs (if too large).
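One reading of that edge-removal step, sketched under the assumption that the graph is a dict from state to a list of successor states. The helper names & the four group names are made up for illustration, & the recursion over too-large groups is left out.

```python
def reachable(graph, start, skip_edge):
    """States reachable from `start` without crossing `skip_edge`."""
    seen = {start}
    stack = [start]
    while stack:
        s = stack.pop()
        for t in graph.get(s, ()):
            if (s, t) != skip_edge and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def split_on_edge(graph, edge):
    """Classify every state by which end of the removed edge still reaches it."""
    source, target = edge
    from_source = reachable(graph, source, edge)
    from_target = reachable(graph, target, edge)
    states = set(graph) | {t for targets in graph.values() for t in targets}
    return {
        "both": from_source & from_target,  # ideally kept together
        "only_source": from_source - from_target,
        "only_target": from_target - from_source,
        "neither": states - from_source - from_target,
    }
```

The three groups besides "both" may each be empty, which would account for the 0-3 remaining subgraphs.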

4/5?

alcinnz,
@alcinnz@floss.social avatar

If the connected components remain too large, we'd need to heuristically choose a less-ideal split point. I'd aim to minimize the number of external edges, to make room for more to be added later. But mostly the choice would be quite arbitrary.
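As a rough illustration of that heuristic, the sketch below counts the edges crossing a candidate split & picks the candidate with the fewest. How candidates get proposed is left open, just as in the text; external_edges & best_split are hypothetical names, & the graph is assumed to be a dict from state to a list of successor states.

```python
def external_edges(graph, part):
    """Count the edges with exactly one endpoint inside `part`."""
    part = set(part)
    return sum(
        1
        for source, targets in graph.items()
        for target in targets
        if (source in part) != (target in part)
    )


def best_split(graph, candidates):
    """Pick the candidate subgraph with the fewest external edges."""
    return min(candidates, key=lambda part: external_edges(graph, part))
```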

Talking about external edges... Especially after these optimizations, they could help indicate which other subgraphs should be merged together!

However if a single state imports too many edges, the machinecode would need to give up on prefetching!

5/5 Fin!

alcinnz, to random
@alcinnz@floss.social avatar

Now that Linux's [f]init_module syscalls have loaded & linked a given module into kernel-space, there's a little bit more initialization they do.

After copying commandline-style args into kernelspace, the kernel locates & parses a build ID from the module, possibly initializes some debugging (rabbithole) data, validates that we're not exporting duplicate symbols, collects "bug" & "kcfi" table entries, updates CPU-specific memorytables to enforce r/w/x permissions, maybe enables debugging, ...

1/2?

alcinnz,
@alcinnz@floss.social avatar

... iterates over livepatch regions resolving additional symbols with some additional CPU-specific code & running callbacks, notifies registered callbacks that a module's been loaded, parses those loaded commandline-style args (rabbithole), constructs various "object" files in Sysfs (rabbithole) to expose the module's data, considers extracting livepatch info from the ELF file, frees some temp data, possibly sums some stats, gathers data to be inserted into a linkedlist for later freeing, ...

alcinnz,
@alcinnz@floss.social avatar

... iterates over "constructors" in the module to call them, calls its init function under lock unless blocklisted whilst harvesting entropy, updates module's state, runs registered callbacks, delays uevents & finishes all async code (rabbithole), frees some debugging data, tidies up some memory, & commits changes.

Then there's all the locking, & a failure codepath... I'm not digging into that!

3/3 Fin for today! Tomorrow: delete_module() syscall!

alcinnz, to random
@alcinnz@floss.social avatar

Can we please move on to the next hype cycle already? It's tiring how many people expect me to be interested in this one!

alcinnz,
@alcinnz@floss.social avatar

@gianarb That'd be one I appreciate!

18+ doctormo, to random
@doctormo@floss.social avatar

[snip]
"That puts the most crazy copyright owner as an equal participant in deciding whether or not generative AI can work the way the model makers want it to"

Er, boys, that's law. Let's rephrase that so you know what utter villains you are:

"That puts the most crazy grocer as an equal participant in deciding whether or not the business of apple pies can work the way the great apple thieves want it to."

Ask/pay first, you knobs.

18+ alcinnz,
@alcinnz@floss.social avatar

@doctormo Techdirt, they're big anti-copyright advocates!

While I'm definitely sympathetic there, those talking points are easily repurposed to defend these plagiarists!

rysiek, to random
@rysiek@mstdn.social avatar

"S" in "LLM" stands for "Secure"

alcinnz,
@alcinnz@floss.social avatar

@rysiek And the "T" stands for "Trustworthy"!

AnarchoNinaWrites, to random
@AnarchoNinaWrites@jorts.horse avatar

Literally what I've been telling you for the past two years straight.

So presumably y'all are gonna crawl into Brown's mentions now with insightful commentary like "be sure to help Drumpf win" and "you are a Russia" right?

https://www.msnbc.com/opinion/msnbc-opinion/biden-trump-democrats-immigration-election-rcna151979

ht to @gwynnion for sharing the story in the first place, just didn't want to shit up her replies with my personal irritation at dumb fuck reply guy libs on here...

18+ alcinnz,
@alcinnz@floss.social avatar

@AnarchoNinaWrites @gwynnion President Joe Biden should consult with former NZ prime minister Chris Hipkins to hear how well this strategy worked...

alcinnz, to random
@alcinnz@floss.social avatar

In our hypothetical string-centric hardware the "Parsing Unit" would perform most of the computation, including all control flow for the Output Unit! So how'd we program this Parsing Unit?

Its microcode would use bitmask addressing, consulting lookuptables for all active states, whilst updating appropriate preprocessing registers, pushing/popping a stack, & synthesizing Output Unit opcodes.

Except that circuit won't scale to any non-trivial syntax...
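To make the bitmask addressing concrete, here's a small software simulation, assuming one lookuptable per state which maps an input byte to a bitmask of next states. The preprocessing registers, stack, & Output Unit opcodes are all left out, & the names step & run are illustrative only.

```python
def step(active, byte, tables):
    """Advance every active state by one input byte.

    `active` is a bitmask of the currently active states; `tables[s]` maps
    a byte value to the bitmask of states that state s transitions to.
    """
    nxt = 0
    state = 0
    while active:
        if active & 1:
            nxt |= tables[state].get(byte, 0)
        active >>= 1
        state += 1
    return nxt


def run(data, tables, start=0b1):
    """Feed `data` through the tables, returning the final active bitmask."""
    active = start
    for byte in data:
        active = step(active, byte, tables)
        if not active:
            break  # no state survived: the input is rejected
    return active
```

In hardware every active state's lookup would happen in parallel, whereas this loop visits them one by one.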

1/6?

alcinnz,
@alcinnz@floss.social avatar

So I'd have it concurrently prefetch & predecode a convenient & compact machine code in a rigid structure amenable to prefetching. It'd be worth including string literals here!

Upon that I'd bundle some firmware into the CPU including:

  • A mainloop dispatching events & interpreting much (but certainly not all) of the Output Unit's opcodes.
  • A routine preparing a callstack for a new process, returning to the mainloop.
  • Enforcement of the Output Unit's object capabilities upon the Parsing Unit.

2/5?

alcinnz,
@alcinnz@floss.social avatar

But most significantly there'd be firmware for splicing new rules (with different object-capabilities enforced) into an existing program, running optimizations (topic for tomorrow!) where we inevitably overflow the rigid structure we can hope to prefetch/predecode. Enforcing those object-capabilities would involve rewriting certain opcodes.

A Linker (early-boot) would prefetch additional parsers from persistent storage, adding initial Output Unit code to recursively decode & link them!

3/5?

alcinnz,
@alcinnz@floss.social avatar

That Linker would parse filepaths out of comments in the machinecode, & insert placeholder imports to populate later during the lazy-recursion, allowing for components (given enough resiliency) to be hotswap-updated.

Given all this, implementing a textual syntax around the machinecode (with explicit names as opposed to indices) could aid implementing visualization (later topic!), disassembly, codegen, & (where-needed) optimization.

4/5!

alcinnz,
@alcinnz@floss.social avatar

Finally (for today!) to make this Parsing Unit's code human-legible we'd need to define expressions which allow us to relate states without explicitly referencing them; naming every tiny little thing would after all get irritating. The syntax would resemble EBNF or regex, but use nil bytes to quote literal strings.

The EBNF syntax would embed controlvar assignment, Output Unit code, & FPMA code to aid treating them as a single unit. Output/FPMA code would compile to separate pages, with pointers.
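One way the nil-byte quoting might work, sketched as an ordinary tokenizer. The surrounding grammar in the example is invented purely for illustration, since the thread doesn't define the concrete syntax; split_literals is a hypothetical name.

```python
def split_literals(source: bytes):
    """Split source into ("syntax", chunk) & ("literal", chunk) pairs.

    Each nil byte toggles between syntax & literal text, so a literal needs
    no escaping for quote marks or backslashes it happens to contain.
    """
    chunks = source.split(b"\x00")
    if len(chunks) % 2 == 0:
        raise ValueError("unterminated literal")
    return [
        ("literal" if index % 2 else "syntax", chunk)
        for index, chunk in enumerate(chunks)
        if chunk or index % 2  # drop empty syntax chunks, keep empty literals
    ]
```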

5/5

jonikorpi, to random
@jonikorpi@mastodon.gamedev.place avatar

Sooo how do we protect the web from AI-enshittification? A new search engine? Perhaps powered by human curation? Easier self-publishing via better social media? Something else?

alcinnz,
@alcinnz@floss.social avatar

@jonikorpi All of the above?

For a search engine, I'm enjoying https://searchmysite.net/ ...
