doodledup,

It will not make a difference. The internet is free and open by design. You can always scrape the internet any time. A partnership will do nothing but make it a little bit more convenient for them.

cordlesslamp,

So they pulled a “reddit”?

ahriboy,
@ahriboy@lemmy.dbzer0.com avatar

And then Stack Overflow will go the same way Digg did.

JudahBenHur,

god damn- I went over to Digg yesterday to see what it's been like and I shit you not, it is links to reddit threads and instagram posts

sushibowl,

Same as any other social media. Reddit has a lot of twitter, Tumblr and 4chan screenshots, TikTok videos, etc. Lemmy is not much different.

sirboozebum,

These companies don’t realise their most engaged users generate a disproportionate amount of their content.

They will just go to their own spaces.

I think this is a good thing in the long run; the internet will become decentralised again.

geneva_convenience,

CEO will have his bag and be gone by then.

pound_heap,

Well, reddit is doing fine so far. Shareholders are happy

floofloof,

I don’t know. It feels a bit like “When I quit my employer will realize how much they depended on me.” The realization tends to be on the other side.

But while SO may keep functioning fine it would be great if this caused other places to spring up as well. Reddit and X/Twitter are still there but I’m glad we have the fediverse.

Corkyskog,

Companies get hit hard by unplanned vacancies. It won’t take them down, but it can cost them buckets of money in expenses, lost revenue, or both. The thing is, the people who left will never know that, their coworkers will never see it; only people in finance and budget will know how to quantify the impact.

sirboozebum,

Individuals leaving don’t have an immediate impact but entire groups of people?

People can see how that worked out for Boeing when many of their experienced engineers and quality inspectors left.

inset,

I hope it doesn’t end up like it did on Reddit, where all those protests did not result in anything at all.

nxdefiant,

Lemmy’s bigger than ever, and that’s a direct consequence of reddit’s enshittification, so there’s that at least.

bitchkat,

Maybe we should replace Stack Overflow with another site where experts can exchange information? We can call it “Experts Exchange”.

deddit,

codidact … Stack Overflow had a mass exodus of mods 2-3 years ago and some of them made Codidact.

fruitycoder,

Any discussion on making it ActivityPub enabled?

I didn’t see any, but would be curious if anyone else had.

bitfucker,

Expert Sex Change?

skulblaka,
@skulblaka@startrek.website avatar

Also a market there. Especially among programmers. You might be onto something.

lord_ryvan,
afraid_of_zombies,

I agree with your idea. I will be launching a website where users can share content. It will be free because knowledge should be free, and we will make money by selling data…umm selling user data…umm selling T-shirts, I guess. That should be enough to keep the servers running.

Reddfugee42,

Yes, next to Pen Island

ShadowGlider,

deleted_by_author

    yokonzo,

    I mean that’s just been a schoolyard joke for ages

    the_crotch,

    You don’t want that shit done by an amateur

    splatsune,

    Maybe there I can ask where to find a good pen supplier.

    Legend,

    Lemmy could be used as a Stack Overflow alternative. Also, Lemmy is enshittification-repellent by design.

    unreasonabro,

    See, this is why we can’t have nice things. Money fucks it up, every time. Fuck money, it’s a shitty backwards idea. We can do better than this.

    rottingleaf,

    You can be killed with steel, which has a lot of other implications on what you do in order to avoid getting killed with steel.

    Does steel fuck it all up?

    Centralization is a shitty backwards idea. But to understand that, you have to be very conscious of yourself and your instincts, and neuter the part that tells you it isn’t.

    Distributivism minus Catholicism is just so good. I always return to it when I give up on trying to find future in some other political ideology.

    Hamartia,

    List of Distributist parties in the UK:

    • National Distributist Party
    • British National Party
    • National Front

    Hmmm, maybe the Catholic part isn’t the only part worth reviewing.

    Also worth noting that the Conservative Party’s ‘Big Society’ schtick in 2010 was wrapped in the trappings of distributism.

    Not that all this diminishes it entirely but it does seem to be an entry drug for exploitation by the right.

    I gotta hold my hand up and state that I am not read up on it at all, so happy to be corrected. But my impression is that Pope Leo XIII’s conception was to reduce secular power so as to leave a void for the church to fill. And it’s the potential exploitation of that void that attracts the far right too.

    rottingleaf,

    but it does seem to be an entry drug for exploitation by the right.

    Well, it is a right ideology. It can be that, of course.

    0x0,

    Anarchosyndicalism ftw.

    rottingleaf,

    Of leftist ideologies it’s the best one, but not as beautiful and overarching as distributivism.

    patatahooligan,
    @patatahooligan@lemmy.world avatar

    This has nothing to do with centralization. AI companies are already scraping the web for everything useful. If you took the content from SO and split it into 1000 federated sites, it would still end up in an AI model. Decentralization would only help if we ever manage to hold the AI companies accountable for the en masse copyright violations they base their industry on.

    rottingleaf,

    This has everything to do with centralization, just not with the one small context for it which you picked.

    With real decentralization in place market mechanisms work.

    Aceticon,

    Copyright is an artificial, government given Monopoly.

    Market Mechanisms don’t work when faced with a Monopoly, or work badly in situations distorted by the presence of a Monopoly (which is closer to this case, since Stack Overflow has a monopoly on the reproduction of each post on that website, but the same user could post the same answer elsewhere, thus creating an equivalent work).

    Pretty much in every situation where Intellectual Property is involved you see the market failing miserably: just notice the current situation with streaming services, which would be completely different if there was no copyright and hence no possibility of exclusivity of distribution of any titles (and hence streaming services would have to compete on quality of service).

    The idea that the Free Market is something that works everywhere (or even in most cases) is Politically-driven Magic thinking, not Economics.

    floofloof,

    Market forces lead to the creation of large corporations that then shut down market forces and undermine fair markets. Once a few big corporations dominate they coordinate their behavior and prices and shut down any new players entering the market. Regulation can counter it to a point, but once the corporations are wealthy enough to dominate government regulation also fails. Right wingers hasten the process by opposing regulation, and have no good answer to how to prevent markets collapsing into monopolies or cartels. I’m not sure anyone has a good answer to that in a capitalist system.

    rottingleaf,

    You are not arguing with me. Not reading comments before answering them is disrespectful.

    Aceticon, (edited )

    This has everything to do with centralization, just not with the one small context for it which you picked.

    With real decentralization in place market mechanisms work.

    Monopoly situations along with market mechanisms invariably result in centralization (“monopoly” comes from the Greek word for “right of exclusive sale”), hence market mechanism won’t “work” in the sense you mean it in such a scenario, as I explained.

    Your argument is circular because it’s like saying that it will work as long as it creates the conditions to make itself work (which is the same as saying “as long as it works”).

    rottingleaf,

    Decentralization and distribution should be enforced, yes.

    By, for example, institutionalized resistance to anything like IP law, and to regulations and certifications that allow bigger fish to cull those who can’t afford them, while at the same time maintaining regulations against obvious fraud.

    It’s not a circular argument, you’re just not paying attention.

    The friendliness of political systems to decentralization doesn’t correlate much with their alignment in terms of left/right or even authoritarian/libertarian. So in my opinion this should be a third dimension on that political compass everybody’s gotten tired of seeing. And then there are many other dimensions to add, so it’s not very useful.

    JackbyDev,

    You realize that there have been multiple websites scraped, right? So decentralizing doesn’t solve this issue in particular. Especially when federated sites like Lemmy provide a view of the entire fediverse (more or less).

    rottingleaf,

    This is orthogonal to what I’m talking about. I don’t see scraping as a problem.

    JackbyDev,

    The person you were replying to was talking about scraping.

    rottingleaf,

    They were answering me and I wasn’t talking about scraping. Can you arrogant dumb commie shits looking for conflict just fuck off?

    JackbyDev,

    Why are you calling me an arrogant dumb commie shit? Relax. It’s not that serious.

    rottingleaf,

    A reaction developed because of there often being some “eat the rich” types thinking they don’t need a brain because they have taken the right position.

    Jakeroxs,

    Can you explain how reddit comments or stack overflow answers are “copyright infringement”?

    Doesn’t seem relevant to the specific problem this post is about.

    patatahooligan,
    @patatahooligan@lemmy.world avatar

    Just because something is available to view online does not mean you can do anything you want with it. Most content is automatically protected by copyright. You can use it in ways that would otherwise be illegal only if you are explicitly granted permission to do so.

    Specifically, Stack Overflow licenses any content you contribute under the CC-BY-SA 4.0 (older content is covered by other licenses that I omit for simplicity). If you read the license you will note two restrictions: attribution and “share-alike”. So if you take someone’s answer, including the code snippets, and include it in something you make, even if you change it to an extent, you have to attribute it to the original source and you have to share it with the same license. You could theoretically mirror the entire SO site’s content, as long as you used the same licenses for all of it.
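
    A hypothetical illustration of what that attribution could look like in practice when reusing a snippet (the author, answer URL, and function here are placeholders, not a real post):

    ```typescript
    // Adapted from an answer by <author> on Stack Overflow: https://stackoverflow.com/a/<answer-id>
    // Original licensed under CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/).
    // Per the share-alike term, this derived file is distributed under the same license.
    function clamp(value: number, min: number, max: number): number {
      return Math.min(Math.max(value, min), max);
    }
    ```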

    So far AI companies have simply scraped everything and argued that they don’t have to respect the original license. They argue that it is “fair use” because AI is “transformative use”. If you look at the historical usage of “transformative use” in copyright cases, their case is kind of bullshit actually. But regardless of whether it will hold up in court (and whether it should hold up in court), the reality is that AI companies are going to use everybody’s content in ways they have not been given permission to.

    So for now it doesn’t matter whether our content is centralized or federated. It doesn’t matter whether SO has a deal with OpenAI or not. SO content was almost certainly already used for ChatGPT. If you split it into 100s of small sites on the fediverse, it would still be part of ChatGPT. As long as it’s easy to access, they will use it. Allegedly they also use torrents for input data, so even if it’s not publicly viewable it’s not safe. If/when AI data sourcing is regulated, the “transformative use” argument fails in court, and the fines are big enough for the regulation to actually work, then sure, the situation described in the OP will matter. But we’ll have to see if that ever happens. I’m not holding my breath, honestly.

    JackbyDev,

    The irony is that folks complain about stuff like Discord partly because it cannot be scraped by search engines but that would also protect it from being scraped by AI tools.

    the_toast_is_gone,

    Until Discord either starts selling data to OpenAI or they start scraping data from/similar to sites like spy.pet .

    JackbyDev,

    Believe me, I’m not saying Discord is the bastion of hope for data protection or anything like that lol.

    pete_the_cat,

    Someone comes up with something good: look what I made, we can use this to better humanity!

    Corporations: How can we make money off of this?

    Colonel_Panic_,

    Hear me out. Bottle caps.

    lord_ryvan,

    Nah, I can’t imagine the Fallout that would cause

    unreasonabro,

    'Nuff said!

    Daerun,

    Good to know that Stack Overflow will not be a trustworthy place to find solutions anymore.

    FJW,
    @FJW@discuss.tchncs.de avatar

    Frankly, the solution here isn’t vandalism, it’s setting up a competing site and copying the content over. The license of Stack Overflow makes that explicitly legal. Anything else is just playing around and hoping that a company acts against its own interests, which has rarely ever worked before.

    HelloHotel,
    @HelloHotel@lemmy.world avatar

    The license of Stack Overflow makes that explicitly legal

    How and why is it illegal? (I will take down my post about vandalism until I discuss this.)

    FJW,
    @FJW@discuss.tchncs.de avatar

    I’m not saying vandalism is illegal. I’m saying that it borders on immoral and that there is a better, more radical (and thus effective) alternative that one might expect to be illegal but in fact isn’t.

    HelloHotel,
    @HelloHotel@lemmy.world avatar

    My post was mostly about just inserting invisible marks (e.g. non-breaking spaces) into your answers to screw over any machine that is sensitive to Unicode.
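
    A minimal sketch of that idea, assuming non-breaking spaces as the invisible mark (the example text is illustrative):

    ```typescript
    // Replace ordinary spaces with non-breaking spaces (U+00A0): the text renders
    // identically for humans, but no longer matches the original byte-for-byte,
    // which can confuse naive Unicode-sensitive scrapers.
    function saltAnswer(text: string): string {
      return text.replace(/ /g, "\u00A0");
    }

    console.log(saltAnswer("use Array.prototype.map instead of a manual loop"));
    ```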

    old_machine_breaking_apart,

    Maybe we need a technical questions-and-answers site on the fediverse!

    kalpol,

    Not gonna stop your knowledge being fed to an AI.

    ultra,

    what about instances that need you to be logged in to view posts and require authorized requests for federation?

    kalpol,

    All it needs is an account to access troves of training data?

    ultra,

    That should be manually approved

    Saledovil,

    How restrictive do you want to be with the accounts? If you’re too restrictive, there won’t be enough users. If you’re not restrictive enough, the data will be used for AI training.

    Aux,

    That defeats the purpose of a knowledge base. The whole reason why everyone is using SO is that you don’t need an account to access it and it’s fully indexed by Google.

    The real question is why the fuck are people ok with Google indexing SO and not OpenAI? Doesn’t make any fucking sense.

    irreticent,
    @irreticent@lemmy.world avatar

    The real question is why the fuck are people ok with Google indexing SO and not OpenAI? Doesn’t make any fucking sense.

    Because Google is free and OpenAI isn’t. It’s one thing to take free content, index it, then allow anyone to access that index. It’s another thing when you take free content, index it, then hide that index behind a paywall.

    Aux,

    Are you sure? Because Google is not free at all, you’re paying for it through privacy invasion and ads. While ChatGPT is actually free to use for end users - no ads, nothing.

    jnk,

    The price difference is that Google steals your data. That’s it. OpenAI steals data, asks for money to use most of their models, and buys even more data from other companies that steal user data (like Google and SO). Also, indexing web pages is not even the “stealing” part of Google; it’s just not comparable.

    Yes, training AI on user data for free then selling the end product is a reasonable thing to be concerned about. It’d be different if the product was free or the data was sold to them with user consent.

    SO has announced a subscription-based service trained on user data for free, and not only is there no opt-out, they’re mass-banning users for trying to “opt out” manually. Tell me one thing here that’s not completely fucked up.

    Aux,

    But it’s free. Unlike Google.

    irreticent,
    @irreticent@lemmy.world avatar
    Aux,

    No, it’s free chatgpt.com

    As your link is for custom enterprise solutions, it’s worth noting that Google has the same shit which also costs money cloud.google.com/pricing/

    Zacryon,

    It’s “freemium”, not free. There is a difference. You can’t use ChatGPT 4 or the API without paying. Also, you are limited in the number of prompts you can make per hour before you are put on pause and asked to pay.

    Search engines like Ecosia, DuckDuckGo, etc. don’t ask you for money. Regardless how intensively you use it. (They might come with other drawbacks though like Google with privacy, environment, ethical principles, …)

    Aux,

    It’s more free than Google.

    Zacryon,

    I’ve never been asked to pay for using one of the aforementioned search engines. I have been asked to pay for OpenAI products.

    So I don’t see how you come to that conclusion.

    Aux,

    Read the comments

    Zacryon,

    The ones where you just make that claim despite it not being true, or which ones do you mean?

    Aux,

    Not true? Ahaha! Good job spreading misinformation!

    Zacryon,

    Well… as I said. OpenAI asks for money, search engines usually don’t. Ergo, OpenAI is not free. (But freemium.)

    Despite claiming that’s not the case, you lack the necessary proof and don’t seem to care about countering my argument with something of substance.

    Such a discussion will not be fruitful if you are unwilling to deliver.

    Aux,

    It’s free, what else do you want?

    Zacryon,

    That you deliver reasons for why you claim I’m wrong.

    It’s freemium, not free. As I said before, OpenAI limits the number of prompts you can make per hour in case you don’t want to pay. Also, using the API or ChatGPT 4 costs money. Users of search engines are usually not asked for money.

    irreticent,
    @irreticent@lemmy.world avatar

    What does Google’s cloud service have to do with what we’re discussing (Google indexing content vs. OpenAI doing it)? They’re not even similar services.

    Edit: SO -> OpenAI

    Aux,

    The fuck are you talking about?

    old_machine_breaking_apart,

    Is there an actual way to stop it? I don’t think so. At least moving to the fediverse would stop any particular corporation from having a monopoly on it, prevent Reddit-like abuse of power, and give users more power, among a few other things.

    bamfic,

    Nothing stopping them from scraping that too

    FaceDeer,
    @FaceDeer@fedia.io avatar

    This sort of thing is so self-sabotaging. The website already has your comment, and a license to use it. By deleting your stuff from the web you only ensure that the AI is definitely going to be the better resource to go to for answers.

    Rolando,

    I’m not sure about that… in Europe don’t you have the right to insist that a website no longer use your content?

    Z3k3,

    That’s an interesting point. I wonder how LLMs handle GDPR. Would it be like having a tiny piece of your brain cut out?

    000,

    Not when you’ve agreed to a terms of service that hands over ownership of your content to Stack Overflow, leaving you merely licensed to use your own content.

    veniasilente,

    Bets are strong that such ToS are not legally enforceable.

    randompasta,

    That’s why I’m not going to bother contributing to future content.

    NaibofTabr,

    I need to start paywalling my comments.

    BraveLittleToaster,

    Also backups and deleted flags. Whatever comment you submitted is likely backed up already and even if you click the delete button you’re likely only just changing a flag.
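
    A sketch of the soft-delete pattern being described (a hypothetical schema; the real sites’ internals are not public):

    ```typescript
    // "Deleting" just flips a flag; the body stays in storage and in any backups.
    type Comment = { id: number; body: string; deleted: boolean };

    function softDelete(comment: Comment): Comment {
      return { ...comment, deleted: true }; // content is still retained server-side
    }
    ```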

    gencha,

    I feel like a lot of people don’t understand the most basic things about the site. Any user with enough internet points can see deleted posts.

    stoly,

    Edit and save then delete.

    stevedidwhat_infosec,

    Instead of solely deleting content, what if authors moved their content/answers to something self-owned? Can SO even legally claim ownership of the content on their site? Seems iffy in my own, ignorant take.

    matjoeman,

    They can. It’s in the TOS when you make your account. They own everything you post to the site.

    stevedidwhat_infosec,

    Well, I suppose in that case protesting via removal is fine IMO. I think the constructive next step would be to create a site where you, the user, own what you post. Does Reddit claim ownership over posts? I wonder what Lemmy’s “policies” are and if this would be good grounds (here) to start building something better than what SO was doing.

    Aux,

    A SO alternative cannot exist if a user who posted an answer owns it. That defeats the purpose of sharing your knowledge and answering questions as it would mean the person asking the question cannot use your answer.

    stevedidwhat_infosec,

    A SO alternative cannot exist if a user who posted an answer owns it. That defeats the purpose of sharing your knowledge and answering questions as it would mean the person asking the question cannot use your answer.

    Couldn’t these owners dictate how their creations are used? If you don’t own it, you don’t even get a say.

    Aux,

    That’s the point of platforms like SO - you give away your knowledge, for free, for everyone, for any use case. If a user can restrict the use of their answers, then it makes no sense for SO to exist. It’s like donating food to a food bank and saying that your food should only go to white people and not black people.

    stevedidwhat_infosec,

    I’m not sure I agree with your example - it’s more like giving the owners of the donation the ability to choose WHO they are donating to. That means choosing not to donate to companies that might take your food donation and sell it as damaged goods, for example. I wouldn’t want my donation to be used that way. That’s how I see it anyway.

    JackbyDev,

    Everything you submit to StackOverflow is licensed under either MIT or CC depending on when you submitted it.

    stevedidwhat_infosec,

    So does that mean anyone is allowed to use said content for whatever purposes they’d like? That’d include AI stuff too, I think? Interesting twist there, hadn’t thought about it like this yet. Essentially posters would be agreeing to share that data/info publicly. No different than someone learning how to code from looking at examples made by their professors or someone else doing the teaching/talking, I suppose. Hmm.

    repungnant_canary,

    CC (not sure about MIT) virtually always requires attribution, but as GitHub Copilot showed, right now open-“media” authors have basically no way of enforcing their rights.

    Dkarma,

    Probably cuz they gave them away when they open licensed…you know…how it’s supposed to work

    repungnant_canary,

    In most jurisdictions you can’t give away copyright - that’s why CC0 exists. And again most open-source and CC licences require attribution, if you use those licences you have a right to be attributed

    JackbyDev,

    For super permissive licenses like MIT it’s probably fine. Maybe folks would need to list the training data and all the licenses (since a common requirement of even the most permissive licenses is to include a copy of the license).

    As far as I know, a court hasn’t ruled on whether clauses like “share alike” or “copyleft” (think CC BY-SA or GPL) would require anything special or disallow such models. Anyone saying otherwise is just making a best guess. My best guess is (pessimistically) that it won’t do any good because things produced by a machine cannot be copyrighted. But I haven’t done much of a deep dive. I got really interested in the differences between many software licenses a few years back and did some reading, but I’m far from an expert.

    bitwolf,

    So they have to carefully only source the MIT data?

    JackbyDev,

    It hasn’t been tested in court so any answer anyone gives is only a best guess.

    lauha,

    Regardless of the license (apart perhaps from public domain) it is legally still your copyright, since you produced the content. Pretty sure in EU they cannot prevent you from deleting your content.

    JackbyDev,

    But those two licenses give everyone an irrevocable right to do certain things with your content forever and displaying it on a website is one of those things (assuming they follow the other requirements of the license).

    pseudo,
    @pseudo@jlai.lu avatar

    If Stack Overflow taught me anything, it’s that legal jargon about copyright isn’t very effective against Ctrl+C/Ctrl+V.

    FJW,
    @FJW@discuss.tchncs.de avatar

    it is legally still your copyright, since you produced the content. Pretty sure in EU they cannot prevent you from deleting your content.

    They absolutely can, you gave them an explicit (under most circumstances irrevocable) permission to do so. That’s how contracts work.

    lauha,

    Unlike in the US, and I cannot speak for all of the EU, but at least in Finland a contract cannot take away your legal rights.

    FJW,
    @FJW@discuss.tchncs.de avatar

    You can when it comes to copyright. That’s EU law, and anything else would be such a horrible idea that no country would ever set up a law saying otherwise.

    If you could simply revoke copyright licenses you would completely kill any practicality of selling your copyrighted works and it would fully undermine any purpose it served in the first place.

    Bell,

    Take all you want, it will only take a few hallucinations before no one trusts LLMs to write code or give advice

    FaceDeer,
    @FaceDeer@fedia.io avatar

    Maybe for people who have no clue how to work with an LLM. They don't have to be perfect to still be incredibly valuable, I make use of them all the time and hallucinations aren't a problem if you use the right tools for the job in the right way.

    stonerboner,

    This. I use LLM for work, primarily to help create extremely complex nested functions.

    I don’t count on LLM’s to create anything new for me, or to provide any data points. I provide the logic, and explain exactly what I want in the end.

    I take a process which normally takes 45 minutes daily, test it once, and now I have reclaimed 43 extra minutes of my time each day.

    It’s easy and safe to test before I apply it to real data.

    It’s missed the mark a few times as I learned how to properly work with it, but now I’m consistently getting good results.

    Other use cases are up for debate, but I agree when used properly hallucinations are not much of a problem. When I see people complain about them, that tells me they’re using the tool to generate data, which of course is stupid.

    aniki,

    This is how I use it as well. I also have it write tests with the code I give it.

    VirtualOdour,

    Yeah, it’s an obvious sign they’re either not coders at all or don’t understand the tech at all.

    Asking it direct questions or to construct functions with given inputs and outputs can save hours, especially with things that disrupt the main flow of coding - I don’t want to empty the structure of what I’m working on from my head just so I can remember everything needed to do something somewhat trivial like calculate the overlapping volume of two tetrahedrons. Of course I could solve it myself but just reading through the suggestion it offers and getting back to solving the real task is so much nicer.

    barsquid,

    The last time I saw someone talk about using the right LLM tool for the job, they were describing turning two minutes of writing a simple map/reduce into one minute of reading enough to confirm the generated one worked. I think I’ll pass on that.
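
    For reference, the kind of two-minute map/reduce being described is roughly this (a hypothetical example, not the actual code from that exchange):

    ```typescript
    // Sum the prices of in-stock items: filter, map, reduce.
    type Item = { name: string; price: number; inStock: boolean };

    const totalInStock = (items: Item[]): number =>
      items
        .filter(item => item.inStock)
        .map(item => item.price)
        .reduce((sum, price) => sum + price, 0);
    ```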

    linearchaos,
    @linearchaos@lemmy.world avatar

    confirm the generated one worked. I think I’ll pass on that

    LLM wasn’t the right tool for the job, so search engine companies made their search engines suck so bad that it was an acceptable replacement.

    NuXCOM_90Percent,

    Honestly? I think search engines are actually the best use for LLMs. We just need them to be “explainable” and actually cite things.

    Even going back to the AOL days, Ask Jeeves was awesome and a lot of us STILL write our google queries in question form when we aren’t looking for a specific factoid. And LLMs are awesome for parsing those semi-rambling queries like “I am thinking of a book. It was maybe in the early 00s? It was about a former fighter pilot turned ship captain leading the first FTL expedition and he found aliens and it ended with him and humanity fighting off an alien invasion on Earth” and can build on queries to drill down until you have the answer (Evan Currie’s Odyssey One, by the way).

    Combine that with citations of what page(s) the information was pulled from and you have a PERFECT search engine.

    notabot,

    That may be your perfect search engine; I just want proper boolean operators on a search engine that doesn’t think it knows what I want better than I do, and doesn’t pack the results out with pages that don’t match all the criteria just for the sake of it. The sort of thing you described would be anathema to me, as I suspect my preferred option may be to you.

    FaceDeer,
    @FaceDeer@fedia.io avatar

    You're describing Bing Chat.

    NuXCOM_90Percent,

    And google gemini (?) and kagi’s LLM and all the other ones.

    Grandwolf319,

    So my company said they might use it to improve confluence search, I was like fuck yeah! Finally a good use.

    But to be fair, that’s mostly because confluence search sucks to begin with.

    linearchaos,
    @linearchaos@lemmy.world avatar

    They are VERY VERY good at search engine work, with a few caveats that we’ll eventually nail. The problem is, they’re WAY too expensive for that purpose. Single queries take tons of compute and power. Constant training on new data takes boatloads of power.

    They’re the opposite of efficient; eventually, they’ll have to start charging you a subscription to search with them to stay in business.

    Grandwolf319,

    Yeah, every time someone says how useful they find LLM for code I just assume they are doing the most basic shit (so far it’s been true).

    JDubbleu,

    That’s a 50% time reduction for the same output which sounds great to me.

    I’d much rather let an LLM do the menial shit with my validation while I focus on larger problems such as system and API design, or creating rollback plans for major upgrades instead of expending mental energy writing something that has been written a thousand times. They’re not gonna rewrite your entire codebase, but they’re incredibly useful for the small stuff.

    I’m not even particularly into LLMs, and they’re definitely not gonna change the world in the way big tech would like you to believe. However, to deny their usefulness is silly.

    barsquid,

    It’s not a consistent 50%, it’s 50% off one task that’s so simple it takes two minutes. I’m not doing enough of that where shaving off minutes is helpful. Maybe other people are writing way more boilerplate than I am or something.

    JDubbleu,

    Those little things add up though, and it’s not just good at boilerplate. Also just having a more intelligent context-aware auto complete itself I’ve found to be super valuable.

    sramder,
    @sramder@lemmy.world avatar

    […]will only take a few hallucinations before no one trusts LLMs to write code or give advice

    Because none of us have ever blindly pasted some code we got off google and crossed our fingers ;-)

    Seasm0ke,

    Split segment of data without pii to staging database, test pasted script, completely rewrite script over the next three hours.

    Hackerman_uwu,

    When you paste that code you do it in your private IDE, in a dev environment and you test it thoroughly before handing it off to the next person to test before it goes to production.

    Hitting up ChatGPT for the answer to a question that you then vomit out in a meeting as if it’s knowledge is totally different.

    sramder,
    @sramder@lemmy.world avatar

    Which is why I used the former as an example and not the latter.

    I’m not trying to make a general case for AI generated code here… just poking fun at the notion that a few errors will put people off using it.

    avidamoeba,
    @avidamoeba@lemmy.ca avatar

    It’s way easier to figure that out than check ChatGPT hallucinations. There’s usually someone saying why a response in SO is wrong, either in another response or a comment. You can filter most of the garbage right at that point, without having to put it in your codebase and discover that the hard way. You get none of that information with ChatGPT. The data spat out is not equivalent.

    deweydecibel,

    That’s an important point, and it ties into the way ChatGPT and other LLMs take advantage of a flaw in the human brain:

    Because it impersonates a human, people are more inherently willing to trust it. To think it’s “smart”. It’s dangerous how people who don’t know any better (and many people that do know better) will defer to it, consciously or unconsciously, as an authority and never second guess it.

    And because it’s a one-on-one conversation, with no comment section and no one else looking at the responses to call them out as bullshit, the user just won’t second-guess it.

    KeenFlame,

    Your thinking is extremely black and white. Many many, probably most actually, second guess chat bot responses.

    gravitas_deficiency,

    Think about how dumb the average person is.

    Now, think about the fact that half of the population is dumber than that.

    NuXCOM_90Percent,

    We already have those near constantly. And we still keep asking queries.

    People assume that LLMs need to be ready to replace a principal engineer or a doctor or lawyer with decades of experience.

    This is already at the point where we can replace an intern or one of the less good junior engineers. Because anyone who has done code review or has had to do rounds with medical interns knows… they are idiots who need people to check their work constantly. An LLM making up some functions because it saw them on Stack Overflow but never tested them is not at all different from a hotshot intern who copied some code from Stack Overflow and never tested it.

    Except one costs a lot less…

    LucidNightmare,

    So, the whole point of learning is to ask questions from people who know more than you, so that you can gain the knowledge you need to succeed…

    So… if you try to use these LLMs to replace parts of sectors, where there need to be people that can work their way to the next tier as they learn more and get better at their respective sectors, you do realize that eventually there will no longer be people that can move up their respective tier/position, because people like you said “Fuck ‘em, all in on this stupid LLM bullshit!” So now there are no more doctors, or real programmers, because people like you thought it would just be the GREATEST idea to replace humans with fucking LLMs.

    You do see that, right?

    Calling people fucking stupid, because they are learning, is actually pretty fucking stupid.

    NuXCOM_90Percent, (edited )

    Where did I say “Fuck 'em, all in on this stupid LLM bullshit!”?

    But yes, there is a massive labor issue coming. That is why I am such a proponent of Universal Basic Income because there are not going to be enough jobs out there.

    But as for training up the interns: Back in the day, do you know what “interns” did? And by “interns” I mean women because sexism but roll with me. Printing out and sorting punch cards. Compilers and general technical advances got rid of those jobs and pushed up where the “charlie work” goes.

    These days? There are good internships/junior positions and bad ones. A good one actually teaches skills and encourages the worker to contribute. A bad one has them do the mindless grunt work that nobody else wants to. LLMs get rid of the latter.

    And… I actually think that is good for the overall health of workers, if not the number (again, UBI). Because if someone can’t be trusted to write meaningful code without copying it off the internet and not even updating variable names? I don’t want to work with them. I spend too much of my workday babysitting those morons who are just there to get some work experience so they can con their way into a different role and be someone else’s problem.

    And experience will be gained the way it is increasingly being gained. Working on (generally open source) projects and interviewing for competitive internships where the idea is to take relatively low cost workers and have them work on a low ROI task that is actually interesting. It is better for the intern because they learn actual development and collaboration skills. And it is better for the staff because it is a way to let people work on the stuff they actually want to do without the massive investment of a few hundred hours of a Senior Engineer’s time.

    And… there will be a lot fewer of those roles. Just like there were a lot fewer roles for artists as animation tools stopped requiring every single cel of animation to be hand drawn. And that is why we need to decouple life from work through UBI.

    But also? If we have fewer internships that consist of “okay, good job, thanks for that. Next time can you at least try and compile your code? Or pay attention to the squiggly red lines in your IDE? Or listen to the person telling you that is wrong?”, then we have better workers and better junior developers who can actually do more meaningful work. And we’ll actually need to update the interviewing system to not just be “did you memorize this book of questions from Amazon?”, and we’ll have fewer “hot hires” who surprise everyone by being able to breathe unassisted but have a very high salary because they worked for Facebook.

    Because, and here is the thing: LLMs are already as good, if not better than, an intern or junior engineer. And the companies that spend money on training up interns aren’t going to be rewarded. Under capitalism, there is no reason to “take one for the team” so that your competition can benefit.

    assassin_aragorn,

    This is already at the point where we can replace an intern or one of the less good junior engineers. Because anyone who has done code review or has had to do rounds with medical interns know… they are idiots who need people to check their work constantly.

    Do so at your own peril. Because the thing is, a person will learn from their mistakes and grow in knowledge and experience over time. An LLM is unlikely to do the same in a professional environment for two big reasons:

    1. The company using the LLM would have to send data back to the creator of the LLM. This means their proprietary work could be at risk. The AI company could scoop them, or a data leak would be disastrous.
    2. Alternatively, the LLM could self-learn and be solely in house without any external data connections. A company with an LLM will never go for this, because it would mean their model is improving and developing out of their control. Their customized version may end up being better than the LLM company’s future releases. Or, something might go terribly wrong with the model while it learns and adapts. If the LLM company isn’t held legally liable, they’re still going to lose that business going forward.

    On top of that, you need your inexperienced noobs to one day become the ones checking the output of an LLM. They can’t do that unless they get experience doing the work. Companies already have proprietary models that just require the right inputs and pressing a button. Engineers are still hired though to interpret the results, know what inputs are the right ones, and understand how the model works.

    A company that tries replacing them with LLMs is going to lose in the long run to competitors.

    NuXCOM_90Percent,

    Actually, Nvidia recently announced a RAG (Retrieval-Augmented Generation) offering. Basically the idea is that you take an “off the shelf” LLM and then feed your local instance sensitive corporate data. It can then use that information in its responses.
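
    A minimal sketch of the retrieval idea, with a toy word-overlap score standing in for a real embedding model and vector store (all names here are illustrative):

    ```typescript
    // Rank local documents by similarity to the question and prepend the best
    // matches to the prompt, so an off-the-shelf model can answer from private data.
    type Doc = { id: string; text: string };

    const tokenize = (s: string): Set<string> =>
      new Set(s.toLowerCase().split(/\W+/).filter(Boolean));

    function overlap(a: Set<string>, b: Set<string>): number {
      let shared = 0;
      for (const word of a) if (b.has(word)) shared++;
      return shared / Math.max(1, Math.min(a.size, b.size));
    }

    function buildPrompt(question: string, corpus: Doc[], topK = 2): string {
      const q = tokenize(question);
      const context = corpus
        .map(doc => ({ doc, score: overlap(q, tokenize(doc.text)) }))
        .sort((x, y) => y.score - x.score)
        .slice(0, topK)
        .map(({ doc }) => `[${doc.id}] ${doc.text}`)
        .join("\n");
      // This assembled prompt is what gets sent to the off-the-shelf LLM.
      return `Answer using only this context:\n${context}\n\nQuestion: ${question}`;
    }
    ```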

    So you really are “teaching” it every time you do a code review of the AI’s merge request and say “Well… that function doesn’t exist” or “you didn’t use useful variable names” and so forth. Which… is a lot more than I can say about a lot of even senior or principal engineers I have worked with over the years who are very much making mistakes that would get an intern assigned to sorting crayons.

    Which, again, gets back to the idea of having less busywork. Less grunt work. Less charlie work. Instead, focus on developers who can actually contribute to a team and design meetings.

    And the model I learned early in my career that I bring to every firm is to have interns be a reward for talented engineers and not a punishment for people who weren’t paying attention in Nose Goes. Teaching a kid to write a bunch of utility functions does nothing they didn’t learn (or not learn) in undergrad but it is a necessary evil… that an AI can do.

    Instead, the people who are good at their jobs and contributing to the overall product? They probably have ideas they want to work on but don’t have the cycles to flesh out. That is where interns come into play. They work with those devs and other staff and learn what it means to actually be part of a team. They get to work on really cool projects and their mentors get to ALSO work on really cool projects but maybe focus more on the REALLY interesting parts and less on the specific implementation.

    And result is that your interns are now actually developers who are worth a damn.

    Also: One of the most important things to teach a kid is that they owe the company nothing. If they aren’t getting the raise they feel they deserve then they need to be updating their linkedin and interviewing elsewhere. That is good for the worker. And that also means that the companies that spend a lot of money training up grunts? They will lose them to the companies who are desperate for people who can lead projects and contribute to designs but haven’t been wasting money on writing unit tests.

    NaibofTabr, (edited )

    This is already at the point where we can replace an intern or one of the less good junior engineers.

    This is a bad thing.

    Not just because it will put the people you’re talking about out of work in the short term, but because it will prevent the next generation of developers from getting that low-level experience. They’re not “idiots”, they’re inexperienced. They need to get experience. They won’t if they’re replaced by automation.

    ipkpjersi, (edited )

    First a nearly unprecedented world-wide pandemic followed almost immediately by record-breaking layoffs then AI taking over the world, man it is really not a good time to start out as a newer developer. I feel so fortunate that I started working full-time as a developer nearly a decade ago.

    morrowind,
    @morrowind@lemmy.ml avatar

    Dude the pandemic was amazing for devs, tech companies hiring like mad, really easy to get your foot in the door. Now, between all the layoffs and AI it is hellish

    ipkpjersi,

    I think it depends on where you live. Hiring didn’t go crazy where I live, but the layoffs afterwards sure did.

    kibiz0r,

    The quality really doesn’t matter.

    If they manage to strip any concept of authenticity, ownership or obligation from the entirety of human output and stick it behind a paywall, that’s pretty much the whole ball game.

    If we decide later that this is actually a really bullshit deal – that they get everything for free and then sell it back to us – then they’ll surely get some sort of grandfather clause because “Whoops, we already did it!”

    antihumanitarian,

    Have you tried recent models? They’re not perfect no, but they can usually get you most of the way there if not all the way. If you know how to structure the problem and prompt, granted.

    capital, (edited )

    People keep saying this but it’s just wrong.

    Maybe I haven’t tried the language you have but it’s pretty damn good at code.

    Granted, whatever it puts out needs to be tested and possibly edited but that’s the same thing we had to do with Stack Overflow answers.

    VirtualOdour,

    I use it all the time and it’s brilliant when you put in the basic effort to learn how to use it effectively.

    It’s allowing me and other open source devs to increase the scope and speed of our contributions, just talking through problems is invaluable. Greedy selfish people wanting to destroy things that help so many is exactly the rolling coal mentality - fuck everyone else I don’t want the world to change around me! Makes me so despondent about the future of humanity.

    CeeBee,

    I’ve tried a lot of scenarios and languages with various LLMs. The biggest takeaway I have is that AI can get you started on something or help you solve some issues. I’ve generally found that anything beyond a block or two of code becomes useless. The more it generates the more weirdness starts popping up, or it outright hallucinates.

    For example, today I used an LLM to help me tighten up an incredibly verbose bit of code. Today was just not my day and I knew there was a cleaner way of doing it, but it just wasn’t coming to me. A quick “make this cleaner: <code>” and I was back to the rest of the code.

    This is what LLMs are currently good for. They are just another tool like tab completion or code linting

    Spedwell,

    We should already be at that point. We have already seen LLMs’ potential to inadvertently backdoor your code and to inadvertently help you violate copyright law (I guess we do need to wait to see what the courts rule, but I’ll be rooting for the open-source authors).

    If you use LLMs in your professional work, you’re crazy. I would never be comfortable opening myself up to the legal and security liabilities of AI tools.

    Amanduh,

    Yeah but if you’re not feeding it protected code and just asking simple questions for libraries etc then it’s good

    Grandwolf319,

    I feel like it’ll have to cause an actual disaster, with assets getting destroyed, before this becomes common knowledge (like the Challenger shuttle or something).

    Cubes,

    If you use LLMs in your professional work, you’re crazy

    Eh, we use Copilot at work and it can be pretty helpful. You should always check and understand any code you commit to any project, so if you just blindly paste flawed code (like with Stack Overflow), that’s kind of on you for not understanding what you’re doing.

    Spedwell,

    The issue on the copyright front is the same kind of professional standards and professional ethics that should stop you from just outright copying open-source code into your application. It may be very small portions of code, and you may never get caught, but you simply don’t do that. If you wouldn’t steal a function from a copyleft open-source project, you wouldn’t use that function when Copilot suggests it. Idk if Copilot has added license tracing yet (been a while since I used it), but absent that feature you are entirely blind to the extent to which its output is infringing on licenses. That’s huge legal liability to your employer, and an ethical coinflip.


    Regarding understanding of code, you’re right. You have to own what you submit into the codebase.

    The drawbacks/risks of using LLMs or Copilot have more to do with the fact that they generate the likely code, which means the output is statistically biased toward whatever common and unnoticeable bugged logic exists in the average GitHub repo they trained on. It will at some point give you code you read and say “yep, looks right to me” that actually has a subtle buffer overflow issue, or actually fails in an edge case, in a way that is just unnoticeable enough.

    And you can make the argument that it’s your responsibility to find that (it is). But I’ve seen some examples thrown around on Twitter of just slightly bugged loops; I’ve seen examples of it replicating known vulnerabilities; and we have that package-name fiasco in the first article above.

    If I ask myself would I definitely have caught that? the answer is only a maybe. If it replicates a vulnerability that existed in open-source code for years before it was noticed, do you really trust yourself to identify that the moment copilot suggests it to you?

    I guess it all depends on stakes too. If you’re generating buggy JavaScript who cares.
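
    A hypothetical illustration of the kind of slightly bugged loop described above (not one of the actual examples referenced):

    ```typescript
    // Looks like a routine chunking helper, but the `<=` condition appends an
    // empty trailing chunk whenever items.length is an exact multiple of size.
    function chunk<T>(items: T[], size: number): T[][] {
      const chunks: T[][] = [];
      for (let i = 0; i <= items.length; i += size) { // should be i < items.length
        chunks.push(items.slice(i, i + size));
      }
      return chunks;
    }
    ```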

    Hypx,
    @Hypx@fedia.io avatar

    Eventually, we will need a fediverse version of StackOverflow, Quora, etc.

    avidamoeba,
    @avidamoeba@lemmy.ca avatar

    We already have the SO data. We could populate such a tool with it and start from there.

    thfi,
    @thfi@discuss.tchncs.de avatar

    Those would be harvested to train LLMs even without asking first. 😐

    sramder,
    @sramder@lemmy.world avatar

    At this point I’m assuming most if not all of these content deals are essentially retroactive. They already scraped the content and found it useful enough to try and secure future use, or at least exclude competitors.

    rickyrigatoni,

    They scraped the content, liked the results, and are only making these deals because it’s cheaper than getting sued.

    AeroLemming,

    Can they really sue (with a chance of winning) if you scrape content that’s submitted by users? That’s insane.

    Rolando,

    But users and instances would be able to state that they do not want their content commercialized. On StackOverflow you have no control over that.

    ArbitraryValue,

    You can state what you don’t want, but no one will be paying attention. Except maybe the LLM reading your posts…

    pivot_root,

    Yup. Laws are only suggestions until you get caught.

    ArbitraryValue,

    I suspect it isn’t even illegal, but I’m not an expert.

    mox,

    Assuming the federated version allowed contributor-chosen licenses (similar to GitHub), any harvesting in violation of the license would be subject to legal action.

    Contrast that with Stack Exchange, where I assume the terms dictated by Stack Exchange deprive contributors of recourse.

    danc4498,

    I’d rather the harvesting be open to all than only the company hosting it.

    chameleon,
    chameleon avatar

    SO already was. Not even harvested as much as handed to them. Periodic data dumps and a general forced commitment to open information were a big part of the reason they won out over other sites that used to compete with them. SO most likely wouldn't have existed if Experts Exchange didn't paywall their entire site.

    As with everything else, AI companies believe their training data operates under fair use, so they will discard the CC-SA-4.0 license requirements regardless of whether this deal exists. (And if a court ever finds it's not fair use, they are so many layers of fucked that this situation won't even register.)

    linearchaos,
    @linearchaos@lemmy.world avatar

    Honestly? I’m down with that. And when the LLM’s end up pricing themselves out of usefulness, we’ll still have the fediverse version. Having free sites on the net with solid crowd-sourced information is never a bad thing even if other people pick up the data and use it.

    It’s when private sites like Duolingo and Reddit crowd source the information and then slowly crank down the free aspect that we have the problems.

    The Ad sponsored web model is not viable forever.

    bort,

    The Ad sponsored web model is not viable forever.

    a thousand times this

    thejml,

    Not fediverse, but open-source and community run: codidact.com

    avidamoeba,
    @avidamoeba@lemmy.ca avatar

    Oh this looks decent. British non-profit, I like it. Registering.

    linearchaos,
    @linearchaos@lemmy.world avatar

    Smells too much like Duolingo. Here, everyone jump in and answer all the questions. 5 years later, ohh look at this gold mine of community data we own…

    residentmarchant,

    This was actually the whole original point of Duolingo. The founder previously created reCAPTCHA to crowdsource the transcription of scanned books.

    His whole thing is crowd sourcing difficult tasks that machines struggle with by providing some sort of reason to do it (prevent spam at first and learn a language now)

    From what I understand Duolingo just got too popular and the subscription service they offer made them enough money to be happy with.

    linearchaos,
    @linearchaos@lemmy.world avatar

    Duolingo has been systematically enshittifying the free/ad-supported service. Now every time you fart, you get a big unskippable ad trying to get you to subscribe to their service free for 14 days, without telling you the price. They took all that crowdsourced data they weren’t going to profit off of and are making the app a miserable experience without a subscription.

    linearchaos,
    @linearchaos@lemmy.world avatar

    We needed it a few years ago.

    NoIWontPickAName,

    Can we pass on quora?

    Syrc,

    Hey, early Yahoo Answers was very useful. A de-shittified, federated network stripped down to bare questions and answers could be neat.

    Scrollone,

    10 POINTS!!

    rickyrigatoni,

    Federated yahoo answers.

    brbposting,

    how is feddi formed

    HowManyNimons,

    Arguably, they need to do way instain mother> who kill thier babbys. becuse these babby cant frigth back?

    It’s important to remember that it was on the news this mroing a mother in ar who had kill her three kids.

    NoIWontPickAName,

    Too much, can’t figure it out

    BraveLittleToaster,

    Everything you write on here is public. There’s nothing stopping anyone from using that data for training

    VirtualOdour,

    Yeah but didn’t you see the sovereign citizens who think licenses are magic posting giant copyright notices after their posts? Lol

    It’s so childish, ai tools will help billions of the poorest people access life saving knowledge and services, help open source devs like myself create tools that free people from the clutches of capitalism, but they like living in a world of inequity because their generational wealth earned from centuries of exploitation of the impoverished allows them a better education, better healthcare, and better living standards than the billions of impoverished people on the planet so they’ll fight to maintain their privilege even if they’re fighting against their own life getting better too. The most pathetic thing is they pretend to be fighting a moral crusade, as if using the answers they freely posted and never expected anything in return for is a real injustice!

    And yes I know people are going to pretend that they think tech bros won’t allow poor people to use their tech and they base this on assuming how everything always works will suddenly just flip Into reverse at some point or something? Like how mobile phones are only for rich people and only rich people can sell via the internet and only rich people can start a YouTube channel…

    Churbleyimyam,

    At the end of the day, this is just yet another example of how capitalism is an extractive system. Unprotected resources are used not for the benefit of all but to increase and entrench the imbalance of assets. This is why they are so keen on DRM and copyright and why they destroy the environment and social cohesion. The thing is, people want to help each other; not for profit but because we have a natural and healthy imperative to do the most good.

    There is a difference between giving someone a present and then them giving it to another person, and giving someone a present and then them selling it. One is kind and helpful and the other is disgusting and produces inequality.

    If you’re gonna use something for free then make the product of it free too.

    An idea for the fediverse and beyond: maybe we should be setting up instances with copyleft licences for all content posted to them. I actually don’t mind if you wanna use my comments to make an LLM. It could be useful. But give me (and all the other people who contributed to it) the LLM for free, like we gave it to you. And let us use it for our benefit, not just yours.

    jnk,

    Agreed on that last part; making that the default would be a great solution. I could also see using a signature in comments, like that person who always appends the “Commercial AI thingy”, but added automatically.
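
    A rough sketch of what that automatic signature could look like, purely as an illustration: the footer text, the endpoint, and the auth handling below are all assumptions, not anything Lemmy ships by default.

    ```typescript
    // Sketch only: append a copyleft notice to a comment body before submitting it.
    const LICENSE_FOOTER =
      "\n\n---\nThis comment is licensed under CC BY-SA 4.0: " +
      "https://creativecommons.org/licenses/by-sa/4.0/";

    function withLicenseFooter(body: string): string {
      // Don't stack the footer if it's already present.
      return body.endsWith(LICENSE_FOOTER) ? body : body + LICENSE_FOOTER;
    }

    // Hypothetical submit wrapper. Auth handling differs between Lemmy versions,
    // so the bearer-token header here is an assumption for illustration.
    async function submitComment(instance: string, token: string, postId: number, body: string) {
      return fetch(`${instance}/api/v3/comment`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${token}`,
        },
        body: JSON.stringify({ content: withLicenseFooter(body), post_id: postId }),
      });
    }
    ```

    Whether a footer like that is legally binding is a separate question, but as a default it would at least make the intent explicit.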

    CancerMancer,

    An idea for the fediverse and beyond: maybe we should be setting up instances with copyleft licences for all content posted to them. I actually don’t mind if you wanna use my comments to make an LLM. It could be useful. But give me (and all the other people who contributed to it) the LLM for free, like we gave it to you. And let us use it for our benefit, not just yours.

    This seems like a very fair and reasonable way to deal with the issue.

    madis,

    Well, supposedly people can use it without paying and without an account, though I can’t confirm the last part on the official site.

    HelloHotel,
    @HelloHotel@lemmy.world avatar

    Open access != Copyleft, but it’s a decent start.

    Churbleyimyam,

    Can you explain?

    HelloHotel,
    @HelloHotel@lemmy.world avatar

    Copyleft licenses are anti-copyright copyright licenses. They guarantee any random person the right to use and (usually) modify and (usually) distribute the work (art, program, etc.), with some noteworthy terms and conditions. Open access is where they provide a good or service for free but are not legally required to do so.

    I bitch about it not being open-sourced like Llama 2.

    Churbleyimyam,

    I think you still have to have an account (last time I used it, anyway), but you’re right, there is a tier you don’t have to pay any money for. It’s just an email address, but whatever. You can use it via their website, but AFAIK they haven’t released a free model based on the data they’ve scraped off us, so you can’t host it on your own hardware and properly do what you want with it. I have heard, though, that commercial websites were/are using ChatGPT bots for customer service, and you can easily use those customer-service chatbots to do other random stuff like writing bash scripts or making yo mama jokes.

    nasduia,

    Why does OpenAI want 10-year-old answers about using jQuery whenever anyone posts a JavaScript question, followed by aggressive policing of what is and isn’t acceptable to re-ask as technology moves on?

    nialv7,

    They probably aren’t looking for the factual information, perhaps more the logical thinking abilities.

    btaf45,

    jQuery is still an excellent JavaScript library.

    jj4211,

    Nice try, ChatGPT

    btaf45,

    jQuery will still be around after the latest JavaScript framework of the month is long gone.

    jj4211,

    Maybe, but I wouldn’t say it’s really excellent.

    It was basically helping people deal with ancient browsers (particularly IE6) and a JavaScript runtime bereft of convenience features, at the cost of some syntactic awkwardness and performance.

    If you’re targeting ES2020 and above, as is widely considered a reasonable requirement, you pretty much get everything jQuery brings to the table built in, without an additional download and without an abstraction layer that costs some cycles.
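
    For context, a rough side-by-side of the kind of thing jQuery used to paper over versus what modern browsers ship natively. Illustrative pairs only, not a migration guide.

    ```typescript
    // Selecting elements and toggling a class
    // jQuery: $(".card").addClass("highlight");
    document.querySelectorAll(".card").forEach((el) => el.classList.add("highlight"));

    // Event handling
    // jQuery: $("#save").on("click", handler);
    document.getElementById("save")?.addEventListener("click", () => console.log("saved"));

    // AJAX
    // jQuery: $.getJSON("/api/items", cb);
    fetch("/api/items")
      .then((res) => res.json())
      .then((items) => console.log(items));
    ```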

    Simon, (edited )

    Why?? Please make this make sense. Having AI help with coding is ideal, and it’s probably the greatest immediate use case. The web is an open resource. Why die on this stupid hill instead of advocating for a privacy argument that actually matters?

    Edit: Okay, got it. Hinder significant human progress because a company I don’t like might make some more money from something I said in public, which has been a thing literally forever. You guys really lack a lot of life skills about how the world really works, huh?

    Thorny_Insight,

    Hating on everything AI is trendy nowadays. Most of these people can’t give you any coherent explanation for why; they just adopt the attitude of the people around them, who also don’t know why.

    I believe the general reasoning is something along the lines of not wanting bad corporations to profit from their content for free, so it’s mostly a matter of principle. Perhaps we need to wait for someone to train an LLM on the freely-available-to-everyone data on Lemmy, and then we can interview it to see what’s up.

    Eyck_of_denesle,

    Mega-corporations like Microsoft and Google are evil. Very easy explanation. Even if it were a good open-source company scraping the data to train AI models, people should be free to delete the data they put in. It’s pretty simple to understand.

    iltg,

    Human progress is spending cities’ worth of electricity and water to ask Copilot how to use a library and have it lie back to you in natural language? Please make this make sense.

    Simon,

    ??? So why don’t we get better at making energy rather than getting scared about using a renewable resource? Fuck it, let’s just go back to the printing press.

    Amazing to me how stuff like this gets upvoted on a supposedly progressive platform.

    Cethin,

    We’re in a capitalist system and these are for-profit companies, right? What do you think their goal is? It isn’t to help you. It’s to increase profits. That will probably lead to massive numbers of jobs being replaced with AI, and we will get nothing for giving them the data to train on. It’s purely parasitic. You should not advocate for it.

    If it’s open and not-for-profit, it can maybe do good, but there’s no way this will.

    dependencyinjection,

    Why can’t they increase profits by, you know, making the product better?

    Do they have to make things shittier to save money, driving people away and thus having to make it shittier still?

    Cethin,

    If they make it better, that may increase profits temporarily as they draw customers away from competitors. Once you don’t have any competitors, the only way to increase profits is to either decrease expenses or increase revenue. Increasing revenue is limited if you’re already sucking everything you can.

    dependencyinjection,

    And is it wrong to stop at a certain amount of profit?

    Why do they always want more? I ain’t that greedy.

    Cethin,

    To us? No, it isn’t wrong. To them? Absolutely. You don’t become a billionaire by thinking you can have enough. You don’t dominate a market while thinking you don’t need more.

    VirtualOdour,

    Meta and Google have done more for open-source AI than anyone else. I think a lot of antis don’t really understand how computer science works, so they imagine it’s like collecting up physical iron and taking it into a secret room, never to be seen again.

    The actual tools and math are what’s important. Research on the best methods is complex and slow, but so far all these developments are being written up in papers anyone can learn from. If people on the left weren’t so performative and lazy, we could have our own AI too.

    Cethin,

    I studied computer science in university. I know how computer science works.

    TheObviousSolution,

    Because being able to delete your data from social networks you no longer wish to participate in, or that have banned you, is a privacy argument that actually matters, regardless of and independent from AI, as long as they specifically haven’t paid you for your contributions.

    In regard to AI, the problem is not with AI in general but with proprietary, for-profit AI being trained on open resources, even ones with underlying license agreements that prevent the information from being monetized.

    Simon,

    Now this is something I can get behind. But I was talking about the decision to retaliate in the first place.

    Allero,

    Because none of the big companies listen to the privacy argument. Or any argument, really.

    AI in itself is good, amazing, even.

    I have no issue with open-source, ideally GPL- or similarly licensed AI models trained on Internet data.

    But involuntarily participating in training closed-source corporate AIs… no, thanks. That shit should go to the hellhole it was born in, and we should do our best to destroy it, not advocate for it.

    If you care about the future of AI, OpenAI should long be on your enemy list. They expropriated an open model, they were hypocritical enough to keep “open” in the name, and then they essentially sold themselves to Microsoft. That’s not the AI future we should want.

    Facebones,

    Good to know that as capitalism flounders, this modern Red Scare extends into tech.

    You’re explicitly ignoring everything everyone is saying just cause you want to call everyone technocommies lmfao.

    Simon,

    When you say those words do you imagine normal people reading this and not laughing

    Facebones,

    When you say those words do you imagine yourself as normal people?

    VirtualOdour,

    Why do people roll coal? Why do they vandalize electric car chargers? Why do people tie ropes across bike lanes?

    Because a changing world is scary and people lash out at new things.

    The coal rollers think they’re fighting a valiant fight against evil corporations too. They invested their effort into being car guys, and it doesn’t feel fair that things are changing, so they want to hurt the people benefiting from the new tech.

    Simon,

    The deeper I get into this platform, the more I realize the guise of being “progressive, left, privacy-conscious, tech-inclined” is literally the opposite.

    pseudo,
    @pseudo@jlai.lu avatar

    Angry users claim they are enabled to delete their own content from the site through the “right to forget,” a common name for a legal right most effectively codified into law through the EU’s General Data Protection Regulation (GDPR). Among other things, the act protects the ability of the consumer to delete their own data from a website, and to have data about them removed upon request. However, Stack Overflow’s Terms of Service contains a clause carving out Stack Overflow’s irrevocable ownership of all content subscribers provide to the site

    It really irritates me when a ToS simply states that they will act against the law.

    hikaru755,

    It’s not quite that simple, though. GDPR is only concerned with personally identifiable information. Answers and comments on SO rarely contain that kind of information as long as you delete the username on them, so it’s not technically against GDPR if you keep the contents.

    windpunch,

    You could argue that people can be identified by their writing style. I have no idea how far you’d get with that though.

    FJW,
    @FJW@discuss.tchncs.de avatar

    Frankly I don’t see any way whatsoever that this would fly, and that’s a good thing!

    Imagine what it would mean for software development if one angry dev could request the deletion of all their contributions at a moment’s notice by pointing to a right to be forgotten. Documentation is really not meaningfully different from that.
