netbsd,
@netbsd@mastodon.sdf.org avatar

New development policy: code generated by a large language model or similar technology (e.g. ChatGPT, GitHub Copilot) is presumed to be tainted (i.e. of unclear copyright, not fitting NetBSD's licensing goals) and cannot be committed to NetBSD.

https://www.NetBSD.org/developers/commit-guidelines.html

BrodieOnLinux,
@BrodieOnLinux@linuxrocks.online avatar

@netbsd Is it intentional that AI-generated documentation is not mentioned, or was that not thought of during the update?

netbsd,
@netbsd@mastodon.sdf.org avatar

@BrodieOnLinux The contract developers have historically signed uses the "tainted code" wording.

BrodieOnLinux,
@BrodieOnLinux@linuxrocks.online avatar

@netbsd If I'm understanding your reply correctly then it also applies to documentation

eschaton,
@eschaton@mastodon.social avatar

@netbsd Bravo!

andrei,
@andrei@mastodon.sdf.org avatar

@netbsd I don't think this was a necessary policy. I think the code should be reviewed on a case-by-case basis. AI in its current state is mostly an advanced completion tool, and I believe it could improve the productivity of developers significantly.

julienbarnoin,
@julienbarnoin@mastodon.gamedev.place avatar

@andrei @netbsd I have a hard time believing that it would help much. In my experience, actually typing the code and getting the syntax right and whatnot is hardly what takes time; you can type pretty much as fast as an LLM can generate tokens once you know what you want.

The part that actually takes time is understanding the needs correctly, reasoning about a possible solution and its impacts, reflecting on all possible edge cases, etc. No LLM can replace humans at that part.

netbsd,
@netbsd@mastodon.sdf.org avatar

@julienbarnoin @andrei This policy is not about code quality, it's about copyright.

iwein,
@iwein@mas.to avatar

@netbsd not sure if a specific policy is needed here.

  1. The committing member is already responsible for copyright issues.
  2. Whether code is generated by a technical system or a natural neural net doesn't, on its own, make much difference to how suspect the code is.
  3. There are other tainted sources, like Stack Overflow, that would call for a specific policy as well. Clarity and conciseness would suffer.

I think it should be a hiring policy instead.

netbsd,
@netbsd@mastodon.sdf.org avatar

@iwein This is a hiring policy - it's part of the developer contract that all new members of the Foundation are required to sign. Foundation membership is required for commit access.

jokeyrhyme,
@jokeyrhyme@aus.social avatar

@netbsd I wonder how this might apply to models that are trained only on permissively-licensed (BSD) code, assuming the output was carefully reviewed by a human and meets the quality bar? https://docs.tabnine.com/main/welcome/readme/ai-models

netbsd,
@netbsd@mastodon.sdf.org avatar

@jokeyrhyme That would require review and approval by core@.

rzeta0,
@rzeta0@mastodon.social avatar

@netbsd I wonder if OpenBSD will do this too?

netbsd,
@netbsd@mastodon.sdf.org avatar

@rzeta0 Us not being the boss of them is kind of "the point"

mark,
@mark@mastodon.fixermark.com avatar

@netbsd Figuring out whether code is tainted by use of copyrighted code from another source is as straightforward as string matching, maybe some fuzzy matching.

How would one identify code generated with the assistance of an LLM if the contributor doesn't admit to doing that?
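
For illustration, the string matching mark mentions can be sketched in a few lines of C. This is a minimal, invented example using Levenshtein edit distance to score how close a candidate snippet is to a known one; the snippets and threshold are made up, and real clone-detection tools would tokenize the code and normalize identifiers before comparing:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Classic dynamic-programming Levenshtein edit distance,
 * using two rolling rows to keep memory at O(len(b)).
 */
static size_t
levenshtein(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t *prev = malloc((lb + 1) * sizeof(*prev));
    size_t *curr = malloc((lb + 1) * sizeof(*curr));

    for (size_t j = 0; j <= lb; j++)
        prev[j] = j;

    for (size_t i = 1; i <= la; i++) {
        curr[0] = i;
        for (size_t j = 1; j <= lb; j++) {
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            size_t del = prev[j] + 1;        /* delete a[i-1]     */
            size_t ins = curr[j - 1] + 1;    /* insert b[j-1]     */
            size_t sub = prev[j - 1] + cost; /* substitute        */
            size_t min = del < ins ? del : ins;
            curr[j] = min < sub ? min : sub;
        }
        memcpy(prev, curr, (lb + 1) * sizeof(*prev));
    }

    size_t d = prev[lb];
    free(prev);
    free(curr);
    return d;
}

int
main(void)
{
    /* Hypothetical "known" snippet and a lightly renamed candidate. */
    const char *known     = "for (i = 0; i < n; i++) sum += a[i];";
    const char *candidate = "for (j = 0; j < n; j++) total += a[j];";

    size_t d = levenshtein(known, candidate);
    size_t longest = strlen(known) > strlen(candidate) ?
        strlen(known) : strlen(candidate);

    /* Similarity ratio: 1.0 means identical, 0.0 means unrelated. */
    double similarity = 1.0 - (double)d / (double)longest;

    printf("edit distance %zu, similarity %.2f\n", d, similarity);
    return 0;
}

Even a toy score like this only answers the first half of mark's point, though: it can flag code lifted from a corpus you already have, but freshly generated LLM output has no corpus to match against, which is why the thread keeps circling back to the committer's contractual declaration instead.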

netbsd,
@netbsd@mastodon.sdf.org avatar

@mark This is one of the sets of rules that every person with commit access has to follow. Becoming a committer is not easy; it requires joining the Foundation and signing various contracts that place the burden of responsibility on the member. It's a fairly reasonable assumption that we should be able to trust our members, and if not, they shouldn't be members.

daniel_collin,
@daniel_collin@mastodon.gamedev.place avatar

@asmodai There is no way they can verify that, though.

ParadeGrotesque,
@ParadeGrotesque@mastodon.sdf.org avatar

@daniel_collin @asmodai

A cursory glance at the code should be enough. If not, a bit of fuzzing can be helpful, in my opinion.

daniel_collin,
@daniel_collin@mastodon.gamedev.place avatar

@ParadeGrotesque @asmodai You would have to automate it in that case. I doubt reviewers want to sit and prove that some code was generated by an LLM. Also, it could be just a few lines that have been generated, with no way to prove that, and given enough context the LLM and a human may come up with exactly the same result. Would the code be denied in that case, then?

daniel_collin,
@daniel_collin@mastodon.gamedev.place avatar

@ParadeGrotesque @asmodai If you did something like telling ChatGPT to "generate me a bubble sort in C" and copy/pasted the result, that is likely easy to spot, but more subtle cases will be quite hard, imo.
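
For reference, that kind of prompt tends to yield the textbook routine below almost verbatim (an illustrative reconstruction, not actual ChatGPT output), which is exactly why a straight copy/paste is conspicuous:

#include <stdio.h>

/* Textbook bubble sort: repeatedly swap adjacent out-of-order elements. */
static void
bubble_sort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j < n - i - 1; j++)
            if (a[j] > a[j + 1]) {
                int tmp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = tmp;
            }
}

int
main(void)
{
    int a[] = { 5, 2, 9, 1, 7 };
    int n = sizeof(a) / sizeof(a[0]);

    bubble_sort(a, n);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
    return 0;
}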

ParadeGrotesque,
@ParadeGrotesque@mastodon.sdf.org avatar

@daniel_collin

True, but on the other hand, something tells me NetBSD developers are unlikely to rely too much on ChatGPT - a portable system is not something I see an LLM as being able to produce code for.

NetBSD code is reportedly of a high standard in general, as the devs place an emphasis on correctness. I think anything generated by ChatGPT would stick out like a sore thumb.

To be clear: NONE of my code will ever make it into NetBSD either! 🤓

@asmodai

asmodai,
@asmodai@mastodon.social avatar

@daniel_collin True, but at least having a policy gives you something to fall back on in dubious cases?

daniel_collin,
@daniel_collin@mastodon.gamedev.place avatar

@asmodai I still think it would be hard. Sure, if you can prove "backwards" that some specific input generates exactly the code that someone committed, without any changes, then maybe, but usually you don't write code that way.

You implement something and then you change stuff to what you want it to do. In general using LLMs to do algorithms is a bad idea.

But using it for generating boilerplate (i.e. repeating code patterns) and test code is very useful and doesn't affect the "real" code.
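
A small sketch of the "repeating code patterns" daniel_collin means, using a hypothetical clamp() as the function under test; the table rows are the mechanical part a completion tool fills in well:

#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical function under test. */
static int
clamp(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

int
main(void)
{
    /* Table-driven test cases: the repetitive, boilerplate part. */
    static const struct { int v, lo, hi, want; } tests[] = {
        {  5, 0, 10,  5 },  /* in range: unchanged       */
        { -3, 0, 10,  0 },  /* below range: clamped low  */
        { 42, 0, 10, 10 },  /* above range: clamped high */
        { 10, 0, 10, 10 },  /* boundary: unchanged       */
    };

    for (size_t i = 0; i < sizeof(tests) / sizeof(tests[0]); i++)
        assert(clamp(tests[i].v, tests[i].lo, tests[i].hi)
            == tests[i].want);

    printf("all tests passed\n");
    return 0;
}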

peacememories,
@peacememories@chaos.social avatar

@daniel_collin @asmodai The ability to enforce a policy perfectly is not a prerequisite for having one. Look at... I dunno... law.
