Corbin

@Corbin@programming.dev

This profile is from a federated server and may be incomplete. Browse more on the original instance.

Corbin,

Nah, even if there were one holistic/catholic/apostolic/ecumenical GNU/Linux distribution, it would not follow that it’s “better than Windows” for many folks, let alone that “people would move”. Folks are very slow to adopt new technologies, very hesitant to step outside of an established market duopoly, and generally not prepared to work with computers as they are.

Also, you’d have to switch to NixOS.

Corbin,

python3Packages.scikit-image appears to be available and non-broken in nixpkgs; on my machine, I get /nix/store/w8681ncsw92cn4gq6gyraw4z19r0r6c3-python3.11-scikit-image-0.21.0. Do you have an actual example?

I understand your point, but given nixpkgs’ position in the community, it might be a moot point.

Corbin,

The NixOS community is forked on a regular basis. So is the nixpkgs community. The focus here is on control of CppNix, e.g. blocking the autotools-to-meson PR.

It’s pretty easy to have opinions that the mods and Foundation don’t like. Be polite and you’ll be fine. Also, don’t be like Jon, who thinks “polite” means smiling through gritted teeth and being passive-aggressive.

Corbin,

There are subfields of computer science dedicated to this question. A good starting point for the theory would be Pessimal algorithms and simplexity analysis, which lays out two concepts:

  • The time & space simplexity of an algorithm indicates best-case lower bounds on resource usage, and
  • An algorithm is pessimal if no equivalent algorithm wastes more time/space/etc.

For example, common folklore is that sorting has O(n lg n) time complexity, depending on assumptions. In the paper, they give that sorting has Ω(n ** (lg n / 2)) time simplexity; any algorithm which takes more time, like bogosort, must do so through some sort of trickery like non-determinism or wasting time via do-nothing operations.

Writing an EBNF -> Scheme parser generator using AI for machine-unfriendly descriptions, and further enhancements?

Hey. I have made some DSLs before, and for all of them, most of which were in C, I have used Flex and Bison. But this time I wanna use Scheme, Cyclone Scheme to be exact. I can potentially use Flex/Bison this time too, because Cyclone has a strong FFI to C....

Corbin,

I’m not sure I understand your reasoning. Here are the highlights for me.

But this time I wanna use Scheme, Cyclone Scheme to be exact.

Why? Scheme is a fine choice in general, but you should only tie yourself to a specific flavor of Scheme if you need specific features…

because Cyclone has a strong FFI to C.

That’s a good reason! Keep in mind that there are usually multiple possible Schemes. In this case, I’d be aware of CHICKEN Scheme as well.

I want it to translate EBNF into Scheme.

Scheme what? What sorts of programs? Presumably a lexer and parser and AST manipulation kit? In a certain sense, that’s all that one can do with grammars. Don’t get me wrong – if you can parse Antlr grammars, then you have hundreds of popular languages at your fingers. But I think that you need to clarify your three languages: your source, your target, and your implementation. I like to use tombstone diagrams for this.

I think an LLM can help.

No, I’m afraid not. Even if all the popular claims about them were true, LLMs wouldn’t help you understand compilers; at best, they might drop some phrases that can be looked up on Wikipedia.

I need an LLM that would be used in an EBNF -> Scheme parser generator.

Wants are not needs. The parser generator could be handwritten. This was a painful lesson for me when I was younger; I faffed around with a parser generator for a couple weeks, and then one of the old hands wrote two thousand lines over a week, lexer and parser and test cases.

First draft grammar of Xeph, a non-pure functional language, a superset of ASDL. Input is needed. (+ guide on reading EBNF) (gist.github.com)

Please tell me what you honestly think of it — this is not a ‘labor of love’ exactly, I only spent a few hours on it. But I think its a strong foundation for what comes next. The implementation, I hope, will be an orgy of love!...

Corbin,

It looks alright. You’ll have to use it for a few months before knowing whether it’s comfortable.

To be honest, I’m not a fan of variables; I’m in the tacit/concatenative camp. But I think it’s good to try new things and learn for yourself why they are good or bad.

Pretty critical PR for rust-msi is getting held up because the maintainer understands the intent but not why this works (github.com)

Not to throw shade, just wishing that somebody here can understand. Whenever an input is reasonably long, an analyzing function will crash, and this PR aims to fix that with a mechanism that contradicts the maintainer's understanding while a similar C implementation does not need this fix. Clearly, the maintainer has not heard a...

Corbin,

There is no evidence that any human understands computers.

Corbin,

Good work! This is a good start. These sorts of explanations build everybody’s understanding, including your own.

I’d like to mention E-order execution as a weaker alternative to seq. The idea is that sequential operations can be carried out in parallel if they don’t interact with each other’s memory locations. One way to do this is transactional memory, which has been implemented in hardware but is more common in software, hence software transactional memory or STM. That’s how Haskell does it. Another way is with message passing, as in Erlang or Elixir; each tiny process has its own private memory.

The name “E-order” comes from E, which had the ability to start multiple event loops. Erlang and Haskell only have one event loop, but E could give each loop its own private memory and CPU time; think thread-per-loop, but thousands of userspace loops scheduled onto a few threads with thread-per-core scheduling. In E and its descendants, each loop is called a “vat” and holds roughly one of your Frame Trees per vat. A vat takes “turns” and each turn is roughly like executing one of your Tree Branches. The trick is that a vat turn can use a native call stack since it is only one branch of the tree, and is amenable to standard JIT techniques.

Corbin,

A PDF is available here, and analysis from Colyer 2016 is good.

This paper is fascinating in terms of ethnography. Consider: the paper mentions “branch” or “branches” dozens of times, but only says “tree” four times, and every instance is in the phrase “working tree”. The paper never mentions “blob” or “blobs”, “DAG” or “graph” or “poset”. The authors either chose to omit git’s data model, or they don’t know about it. The implication is that the UX and UI don’t reflect the data model, I suppose, but it is a very curious omission.

Now, contrast this with Git’s documentation. When sysadmins teach git, we focus on the data first. git is a kind of database which stores four different flavors of object, and the git UI is merely a collection of commands for programmatically manipulating the database. All of the various UX is purpose-built, on a per-command basis, for development workflows. New commands can be implemented as plain UINX-style executable scripts in any language.

In summary, this paper looks at git as a version-control product, while its developers and users look at git as a version-control framework.

There was a followup paper from a few years later, also with Colyer 2016 analysis; this paper has too many glaring defects to discuss here.

On a personal note, I saw this and am happy to note that science has marched along:

We plan to extend our notation to make it more expressive in the future, but are cognizant of the fact that diagrammatic syntaxes for first order logic have a long and troubled history.

Not long after this paper, ontology logs were figured out, which can be made as expressive as needed for the case of relations; see Patterson 2017.

Corbin,

I agree that the UX of existing git commands is not great. They evolved in a particular insular environment – Linux kernel development with heavy mailing-list usage and large multi-headed merges, with occasional pull requests and manual integration testing.

Check out my top-level comment for a link to git’s data model. A data-first approach with blob, tree, commit, and tag can be enlightening. The on-disk format tries to balance integrity, easy manipulation, disk space, and incremental updates; it’s also weakly monoidal, enabling distributed development. Look up the history of Bitkeeper and git; this is “a version control system [designed] from the ground up with documented architecture from the start”! And there are many non-C implementations as a result, like pure-Python dulwich.

Corbin,

The original workflow, which is still in use today, is git-blame followed by git-show: look up candidate commits, then examine their history individually. This can be accelerated with a GUI; e.g. GitHub and GitLab support blame-style views.

Corbin,

Congratulations on taking a step towards self-hosting and meta-circular compilation. ASDL is a great intermediate meta-language and it can be used to abstract any sort of Builder or Command workflow. This is sometimes called a “narrow waist”; ASDL replaces ad-hoc AST-builder objects with a unified protocol and interface.

For example, I encoded Monte into ASDL while rewriting the compiler in Monte, as part of a self-hosting effort. I also wrote a module which parses ASDL and emits Monte modules, including AST-building tools at runtime. Monte supports E-style quasiliterals, including source-code literals. This let our compiler directly consume ASDL files and emit Monte source code. Going beyond the compiler, this allowed me to encode UI elements and widgets as ASTs and use the Command pattern to write widget-handling objects.

Corbin,

Your code looked alright. Working in C is a risky chore. You’re early in your journey and I think it’s good to get a taste of many of the traditional techniques before turning towards newer fancier algorithms.

“Haskell and OCaml”, hm? The two concepts you need to internalize are katamorphisms and paramorphisms. You may have heard of “recursion schemes”; these two schemes are the ones available on any tree. Fundamentally, most of a compiler is tree-to-tree transformations (nanopasses), and they are expressible as one of two forms:

  • Katamorphism: The leaves of each node are transformed before the branch (bottom-up)
  • Paramorphism: The leaves are transformed after/during the branch transformation (top-down)

If you look again at my AST builder builder, you’ll see .run() and .walk() methods, implementing the recursion for any katamorphism or paramorphism respectively. In Haskell, these are called Traversable types. This is a pun in Monte, where .run() is also the default method; the syntax makes it easy to perform a katamorphism by passing a tree-traversing object directly to the AST.

Your types are alright, but you’ll want to pass a generic parameter through them, turning them into a valid Functor in Haskell or a generic module in OCaml. This is a basic defense against the “AST Decoration Problem”, AKA the “AST Typing Problem”. As you add features to your compiler, you’ll realize why these are necessary.

Corbin, (edited )

Either pick a build image that doesn’t have /homeless-shelter, like nixos/nix, or remove it with something like:


<span style="color:#323232;"># UNTESTED
</span><span style="color:#323232;">RUN rmdir /homeless-shelter
</span>

The root cause is that your build filesystem is dirty. When Nix sandboxes a build, it runs the builders as nobody, a permissionless user with no home directory. On Linux, users with no home directory get their $HOME set to /homeless-shelter, and Nix relies on this directory not existing.

Corbin,

Microsoft is no longer able to outcompete the Free Software commons. That’s all.

You might want to re-read the thread and think about how you sound, by the way. You’re coming off as a concern troll, not as a member of the Free Software community.

Corbin, (edited )

The nix-shell advice is cogent enough that somebody might use it. Instead of a bare bash script, consider direnv, which scopes each shell session to a directory and has upstream builtin Nix support.

Edit: When laning with two mages in bot, focus on harassing enemy champions instead of farming. For Veigar, line up each Q to hit a minion and a champion, combining the farm and harass into a single action. Do not be aggressive in the early game; AP takes time to build. The deaths at 13min and 15min were due to aggression, and by 18min it looks like tilt. Ward more; at 22min, Veigar is alone in lane with no visibility into nearby bushes, and 4/5 enemy champions can burst him down if he misses his cage. Rod of Ages is a good call, but an Hourglass would be an even better call, as it can nullify damage from Ahri and Zed combos. Finally, don’t do fadeaway ults at the end of Veigar combos; it feels satisfying, but isn’t as much damage as carefully-timed Q.

'Don't parse markup languages with Regex' is an annoying trollpost and it should die... right?

Look 0 of my work involves HTML, well maybe 1-2 percent does; however, about 60% of my work involves regular expressions, grammar, lexical scanning and syntactic parsing, so it still irks me, and will irk me beyond my grave, when people say shit like ‘Don’t parse HTML/Markdown/etc with regex! Use a parser generator!’...

Corbin,

[HTML and Markdown] are not grammatically Type 2 (Chomsky-wise, Context-Free); rather, they are Type 3 (Chomsky-wise, Regular).

This is at least half-wrong, in that HTML is clearly not regular. The proof is simple: HTML structurally embeds a language of balanced parentheses (a Dyck language), and such languages are context-free and not regular. I don’t know Markdown well and there are several flavors; it’s quite possible that some flavors are regular. However, traditional Markdown embeds HTML, so if HTML is not regular than neither is Markdown.

I once did a syntax-directed translation of Markdown to HTML in AWK!

Sure. The original Markdown implementation was in Perl and operated similarly. However, note that this doesn’t imply that either language is regular, only that a translation is possible assuming the input is valid Markdown. Punting on recognition means that invalid parse trees have undefined translations, presumably at least sometimes generating invalid HTML.

Corbin,

The definition of recursive language can be read on Wikipedia too. Quoting from that page:

All regular, context-free and context-sensitive languages are recursive.

But not all recursive languages are regular. As for “recursive”, it’s a historical term that might not be sufficiently descriptive; today, I’d usually call such languages decidable, to emphasize that they are languages which we can write (always-halting) computer programs to detect.

I created rcp, an OSC52 copy tool for your remote server (codeberg.org)

I have been searching for a simple way to copy loads of text from remote servers for a while. This includes files, but is sometimes also only multiple lines from stdout of a program. Oftentimes this is kinda hard to do in terminal emulators, so I wrote a very small program to copy text via Operating System Commands....

Corbin,

In addition to the sibling comment, note that reproducible build systems from Docker to Nix require a lockfile in order to be reproducible, and if you don’t provide one, then somebody downstream will provide it instead. By checking in Cargo.lock, you ensure control over the precise versions of your dependencies for all downstream users.

Corbin,

Yeah, this list of sites is making me think of asking for a book by loudly asking a library, a series of coffeeshops, a chud microbrewery, and an 11-year-old bully. Try quietly reading in the library first, I guess.

Corbin,

If it’s on Stack Exchange, you can help us keep the community decent by assuming good faith and being patient with newcomers. Yes, it’s frustrating. And yeah, sometimes, it’s basically impossible to avoid sarcasm and scorn, just like how HN sometimes needs to be sneered at, but we can still strive for a maximum of civility.

If all else fails, just remember: you’re not on Philosophy SE or any of the religious communities, it’s just a computer question, and it can be answered without devolving into an opinion war. Pat yourself on the back for being a “schmott guy!” and write a polite answer that hopefully the newbies will grok. Be respectful of plural perspectives; it’s a feature that a question may have multiple well-liked answers.

Corbin,

You can help by gaining points on multiple SE sites and participating in elections. Please vote!

Corbin,

The classic example is the N+1 query pattern, where the number of generated queries is linear in the number of rows returned by the first query.

Corbin,

There are Python compilers which do AST analysis instead of bytecode analysis, particularly Nuitka and Shed Skin. They aren’t very good, but it’s not clear whether that’s because working with the AST is somehow harder than working with the bytecode. RPython doesn’t compile all bytecodes; most generator/coroutine functionality is missing, for example.

Think of type-checking as a syntactic analysis; this is how it avoids Rice’s theorem. Like you say, we can annotate names with type information, and we can do it without evaluating the code. The main problem here is that Python’s semantics don’t require these annotations to enforce the types of values; you may be interested in E, a research language from the 90s which did enforce type annotations on otherwise-untyped names. In Python, this doesn’t error:


<span style="color:#323232;">>>> x :int = "42"
</span>

But in E, this does error:


<span style="color:#323232;">? def x :int := "42"
</span><span style="color:#323232;"># problem: <ClassCastException: String doesn't coerce to an int>
</span>

Sadly, E is long dead, and something of an archeological artifact rather than a usable system. But it may be inspiring to your future efforts, especially since it sounds like you’re learning how to build compilers. (I helped write Monte, a language which blends E and Python; it is also dead, but was more enjoyable than E.)

Corbin,

I copied and pasted from the terminal to ensure that I formatted the error message properly. The question-mark prompt is what E used, or at least E-on-Java. Monte used a little Unicode mountain:


<span style="color:#323232;">⛰  currentProcess.getProcessID() :Int
</span><span style="color:#323232;">Result: 2805098
</span><span style="color:#323232;">⛰  def x :Int := "42"
</span><span style="color:#323232;">Exception: "42" does not conform to Int
</span><span style="color:#323232;">⛰  "42" :Int
</span><span style="color:#323232;">Exception: "42" does not conform to Int
</span>

I can’t really give a reason other than that the prompt characters on Unix-like systems are arbitrary and most REPL libraries allow them to be customized.

  • All
  • Subscribed
  • Moderated
  • Favorites
  • JUstTest
  • thenastyranch
  • magazineikmin
  • cubers
  • ngwrru68w68
  • rosin
  • Youngstown
  • khanakhh
  • osvaldo12
  • slotface
  • InstantRegret
  • modclub
  • kavyap
  • DreamBathrooms
  • megavids
  • ethstaker
  • Durango
  • normalnudes
  • everett
  • cisconetworking
  • mdbf
  • tester
  • GTA5RPClips
  • tacticalgear
  • Leos
  • anitta
  • provamag3
  • lostlight
  • All magazines