drahardja,
@drahardja@sfba.social

The Stack Overflow rugpull is another data point in my head which discourages me from contributing any content to a hoard owned by a corporation.

I’m hoping that ActivityPub will one day enable SO-style knowledge bases in which the individual nuggets of content are owned by independent servers and cannot be purchased by anyone.

#stackOverflow #rugpull #fediverse

https://www.tomshardware.com/tech-industry/artificial-intelligence/stack-overflow-bans-users-en-masse-for-rebelling-against-openai-partnership-users-banned-for-deleting-answers-to-prevent-them-being-used-to-train-chatgpt

mdione,
@mdione@en.osm.town

@drahardja I come from the age when communication was either IRC or mail (but not further back; I never touched Usenet).

I never liked SO and siblings. That idea of competing for the best answer feels, well, competitive (in the bad meaning of the word). What I remember of email (not so much IRC) was the discussion and construction of the solution by all the participants.

1/N

mdione,
@mdione@en.osm.town

@drahardja They have mostly been replaced by Discourse, of which I'm not quite a fan because it doesn't have threading. But it has a marvelous quoting system that I would call better than email, and it's inline, no top posting! So maybe I could be a fan, hmmm...

2/N

mdione,
@mdione@en.osm.town

@drahardja Anyways, I just hope we come back a bit to those places. I still do much IRC, and Discourse, and then Slack, Discord¹, GitHub, and so many other platforms, depending on the project. Now even to me mail seems like a nuisance. Mastodon is hit and miss.

¹ that name alone makes me pause every time and ponder if I really want that answer...

drahardja,
@drahardja@sfba.social

@mdione I was a huge fan of Usenet until the spambots came and ruined everything. Discord is impenetrable and un-archiveable, and ripe for another rugpull, so I wouldn’t put anything there that has long-term value. Corporate Slack is actually acceptable for corporate institutional knowledge, but not much more than that.

I liked SO because it’s basically a searchable, self-organizing wiki that bubbles things up and down depending on their usefulness. I haven’t seen anything else quite like it. The gamification only matters if you’re into high scores and that kind of thing—I basically ignored it.

datarama,
@datarama@hachyderm.io

@drahardja But if we have an open knowledge base, AI companies will just take that for free.

drahardja,
@drahardja@sfba.social

@datarama I actually prefer that because then it’s just theft instead of a sale. At least there is some hope of litigation later.

datarama,
@datarama@hachyderm.io

@drahardja That depends on whether it's illegal at all, and at this point, a lot of jurisdictions haven't actually made that decision.

(I think "copyright doesn't apply if you're a corporation with enough computational resources, but it does if you're just a person" is the most ridiculous possible interpretation ... but I also have to realize that judges and politicians can be bought.)

datarama,
@datarama@hachyderm.io

@drahardja In the EU, the situation is currently that AI training systems have to "respect a machine-readable opt-out" if training commercial systems (copyright explicitly doesn't cover training research systems, which is probably why e.g. Stability funded a "non-profit research lab" to do all the actual data gathering for them).

But the only current way to make such an opt-out is a flaky W3C proposal which is easy to circumvent (https://www.w3.org/community/reports/tdmrep/CG-FINAL-tdmrep-20240202/).

datarama,
@datarama@hachyderm.io

@drahardja Let's say, for example, that I self-publish some free code on my website and set up TDMRep to send its machine-readable response to all requests for it.

Some guy takes a copy of my code and puts it on GitHub. Now TDMRep no longer sends a machine-readable response for the version there. So now it gets shoved into Copilot and Starcoder and all the rest - and none of them broke the law, since I wasn't sending them a "machine-readable opt-out".
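
To make the scenario above concrete, here is a minimal sketch of the check a well-behaved crawler might run. It assumes the opt-out is expressed through the `tdm-reservation: 1` HTTP response header described in the TDMRep report linked above; the helper name `tdm_reserved` and the sample header sets are hypothetical, and a real crawler would also need to look at the other signalling methods in that report.

```python
# Sketch only: checks one TDMRep signal (the HTTP response header).
# The helper name and the sample headers below are hypothetical.

def tdm_reserved(headers: dict[str, str]) -> bool:
    """Return True if the response headers carry a TDMRep reservation."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("tdm-reservation") == "1"


# Headers as served by the author's own site, with the opt-out configured.
original = {"Content-Type": "text/plain", "tdm-reservation": "1"}

# Headers for the same file after someone re-uploads it to another host
# that does not send the opt-out.
mirror = {"Content-Type": "text/plain"}

print(tdm_reserved(original))  # True
print(tdm_reserved(mirror))    # False
```

The file contents are identical in both cases; only the hosting server's headers differ, which is why the opt-out does not travel with the copy.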

datarama,
@datarama@hachyderm.io

@drahardja I'm not sure if there is even any way to mount an actual legal challenge to this, or if we just kinda have to accept that any human creative work in the future is just free training data now (unless we guard it and keep it secret - and then what's the point?)

drahardja,
@drahardja@sfba.social

@datarama OK, say we are no worse off when it’s federated when it comes to AI training. I don’t think I agree, but let’s say the law completely lets creators down, and all data on the Internet is fair game.

There are other reasons to federate: replication, no single point of failure, no commercialization of other sorts (e.g. user profiling, targeted ads), better moderation, better data portability, etc.

It prevents a corporation from gaining control over what individuals choose to do with their creations.

drahardja,
@drahardja@sfba.social

@datarama On the legal front, I think there will be a landmark case soon between two moneyed parties—the NYT lawsuit is one such case—that determines whether people have a right to exclude their data from AI training, or at least charge a fee for its use. I think there is a pretty good chance that the law will fall on the side of the creator.
