@PDFuego@lemmy.world avatar

That’s been happening for ages. I’m sure if you check the profiles you’ll find other posts with all the same bots commenting. A lot of lazier ones wait exactly a year to repost, and it’s pretty obvious in subs for something like a live service game where they’ll be reposting complaints that are way out of date. One in the Monster Hunter sub reposted a trailer for Iceborne which had been out for 3 years by that point.


My favorite reposts were the ones that were only like 6 months later, so they’re talking about Christmas or r/place as if it’s that time of year when it’s the total opposite.


These are probably the bots that will be paid for creating content too. lol




Yes, I was surprised as well


Is there a Reddit submission with this screenshot? I wanna read the reactions.

somas avatar


If you want to avoid Reddit tracking altogether a redlib instance will let you do that https://libreddit.projectsegfau.lt/r/conspiracy/comments/170e8dp/reddit_comments_are_full_of_bots_reupload/


Thanks for the link! I usually just use the Reddit link. I’ve had alternative viewers shut down in the past (e.g. Invidious instances for YouTube), and then people can’t access the content anymore.

somas avatar


Yes that’s a concern here as well but it’s pretty easy to run your own instance in Docker or whatever


/r/conspiracy of all places, yuck...


Yeah, not the best place indeed


It used to be cheeky fun. Now it’s Alex jones and alt-right.


Oh man, I would browse while on the shitter at work. It used to be one of my OGs. A lot of tinfoil. And you’d get the deep dives that didn’t feel politically motivated (compared to today).

Then, the Trumpeting.

Like everything else not stapled down circa 2016, it was an easy target for the Russian firehose of falsehood: an entire community of people wanting to believe some alternative bullshit.


It’s perfect if you are targeting gullible, massively ignorant people.

@Carighan@lemmy.world avatar

That’s why you ignore all comments from usernames that are like the default ones. Has been that way for a long time, tbh.


My understanding of how this works is that that left one is real accounts making real comments, at least in the majority.

Then when the link gets reposted, either by a bot or naturally, potentially depending on the title, the bots scrape the old comments and post them.

It’s content farming. And Reddit is probably okay with this.

livus avatar

Reddit is going to poison LLMs sooner than I thought.


Reddit probably omits bot accounts when it sells its data to AI companies

livus avatar

Doubt it, they are interwoven into almost any conversation with more than 70 comments.


If you have access to the entire Reddit comment corpus it’s trivial to see which users are only reposting carbon copies of content that appears elsewhere on the site


It’s probably not as easy as you imagine for reddit to identify and cleanse all bot content.


Look at the picture above - this is trivially easy. We are talking about identifying repost bots, not seeing if users pass/fail the Turing test

If 99% of a user’s posts can be found elsewhere, word for word, with the same parent comment, you are looking at a repost bot
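
A rough sketch of that check, assuming comments are available as simple records; the field names `user`, `ts`, `parent`, and `body` are made up for illustration and are not Reddit's schema. It only catches the word-for-word case described above, not rewritten comments:

```python
from collections import defaultdict


def repost_ratio(comments):
    """Map each user to the fraction of their comments that repeat an
    earlier (parent text, comment text) pair first posted by someone else.

    `comments` is an iterable of dicts with hypothetical keys:
    "user", "ts" (sortable timestamp), "parent", "body".
    """
    first_seen = {}            # (parent, body) -> user who posted it first
    copied = defaultdict(int)  # user -> number of copied comments
    total = defaultdict(int)   # user -> number of comments overall

    for c in sorted(comments, key=lambda c: c["ts"]):
        key = (c["parent"].strip().lower(), c["body"].strip().lower())
        total[c["user"]] += 1
        # setdefault records the first author and returns it on later hits
        original_author = first_seen.setdefault(key, c["user"])
        if original_author != c["user"]:
            copied[c["user"]] += 1

    return {user: copied[user] / total[user] for user in total}
```

An account whose ratio sits near 1.0 over more than a handful of comments would be a candidate repost bot; where to put the cutoff is a judgment call.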


That’s easy in an isolated case like this, but the reality of the entire reddit comment base is much more complex.

livus avatar

Of course it's not. Nor do they want to.

I think the person you're talking to thinks all bots are like the easy ones in this screenshot.

livus, (edited )
livus avatar

The low level bots in OP's screenshot, sure, because it's identical. Not the rest.

I used to hunt bots on reddit for a hobby and give the results to Bot Defense.

Some of them use rewrites of comments with key words or phrases changed to other words or phrases from a thesaurus to avoid detection. Some of them combine elements from 2 comments to avoid detection. Some of them post generic comments like 💯. Doubtless there are some using AI rewrites of comments now.

My thought process is if generic bots have been allowed to go so rampant they fill entire threads that's an indication of how bad the more sophisticated bot problem has become.

And I think @phdepressed is right, no one at reddit is going to hunt these sophisticated bots because they inflate numbers. Part of killing the API use was to kill bot detection after all.


Reddit has way more data than you would have been exposed to via the API though - they can look at things like user ASN (is it coming from a datacenter), whether they were using a VPN, they track things like scroll position, cursor movements, read time before posting a comment, how long it takes to type that comment, etc.

> no one at reddit is going to hunt these sophisticated bots because they inflate numbers

You are conflating “don’t care about bots” with “don’t care about showing bot generated content to users”. If the latter increases activity and engagement there is no reason to put a stop to it, however, when it comes to building predictive models, A/B testing, and other internal decisions they have a vested financial interest in making sure they are focusing on organic users - how humans interact with humans and/or bots is meaningful data, how bots interact with other bots is not


I doubt Reddit is in charge of many of the existing bots on their site.


Reddit has access to its own data - they absolutely know which users are posting unique content and which users’ content is a 100% copy of data that exists elsewhere on their own platform


I know they could, I’m just not sure they’re that competent. These bots often aren’t single-user or just copy-paste either; there’s usually some effort to mix it up or change the wording slightly. Reddit’s internal search function is infamously shit, but they “know” which users are unlabeled bots with some effort put behind them?


I know everyone here likes to circle jerk over “le Reddit so incompetent” but at the end of the day they are a (multi) billion dollar company and it’s willfully ignorant to infer that there isn’t a single engineer at the company who knows how to measure string similarity between two comment trees (hint: import difflib in python)
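
For what the difflib hint could look like in practice, here is a minimal sketch using the standard library's `difflib.SequenceMatcher` on word lists. The `similarity` helper name and the choice to compare words rather than characters are assumptions for illustration, not anything Reddit is known to run:

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Word-level similarity between two comments, from 0.0 to 1.0.

    Lowercasing and splitting on whitespace means a swapped word only
    costs one token instead of several characters.
    """
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


original = "This game has been out for three years already"
rewritten = "This game has been out for 3 years already"
unrelated = "I prefer the older Monster Hunter games"

print(similarity(original, rewritten))  # high: only one word differs
print(similarity(original, unrelated))  # low: no words in common
```

This is a pairwise comparison, so it says nothing about how to pick which pairs to compare across a whole site.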

icydefiance, (edited )
  1. To compare every comment on reddit to every other comment in reddit’s entire history would require an index, and if you want to find similar comments instead of exact matches, it becomes a lot harder to do that efficiently. ElasticSearch might be able to do it, but then you need to duplicate all of that data in a separate database and keep it in sync with your main database without affecting performance too much when people are leaving new comments, and that would probably be expensive.
  2. Comparing combinations of comments is probably impossible. Reddit has a massive number of comments to begin with, and the number of possible subtrees of those comments would just be absurd. If you only care about comparing entire threads and not subtrees, then this doesn’t apply, but I don’t know how useful that will be.
  3. Programmers just do what they’re told. If the managers don’t care about something, the programmers won’t work on it.
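
To make point 1 concrete, one common way to avoid comparing every comment to every other comment is an inverted index over word n-grams ("shingles"), so a new comment is only compared against comments that share at least one shingle. The sketch below is a toy in-memory version with made-up names (`ShingleIndex`, `similar`); a real system at that scale would more likely use something like MinHash/LSH or a search engine, and none of this reflects how Reddit actually stores comments:

```python
from collections import defaultdict


def shingles(text, n=3):
    """Return the set of word n-grams in `text` (lowercased)."""
    words = text.lower().split()
    if len(words) < n:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


class ShingleIndex:
    """Toy inverted index: shingle -> ids of comments containing it."""

    def __init__(self, n=3):
        self.n = n
        self.postings = defaultdict(set)  # shingle -> comment ids
        self.docs = {}                    # comment id -> shingle set

    def add(self, comment_id, text):
        doc = shingles(text, self.n)
        self.docs[comment_id] = doc
        for sh in doc:
            self.postings[sh].add(comment_id)

    def similar(self, text, threshold=0.5):
        """Return (comment_id, jaccard) pairs at or above `threshold`."""
        query = shingles(text, self.n)
        # Only comments sharing at least one shingle are ever compared.
        candidates = set()
        for sh in query:
            candidates |= self.postings.get(sh, set())
        results = []
        for cid in candidates:
            doc = self.docs[cid]
            jaccard = len(query & doc) / len(query | doc)
            if jaccard >= threshold:
                results.append((cid, jaccard))
        return sorted(results, key=lambda r: -r[1])
```

The storage and sync costs raised above still apply: the index is a second copy of the data, just organized for lookup.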

> To compare every comment on reddit to every other comment in reddit’s entire history would require an index

You think in Reddit’s 20 year history no one has thought of indexing comments for data science workloads? A cursory glance at their engineering blog indicates they perform much more computationally demanding tasks on comment data already for purposes of content filtering

> you need to duplicate all of that data in a separate database and keep it in sync with your main database without affecting performance too much

Analytics workflows are never run on the production database, always on read replicas which are taken asynchronously and built from the transaction logs so as not to affect production database read/write performance

> Programmers just do what they’re told. If the managers don’t care about something, the programmers won’t work on it.

Reddit’s entire monetization strategy is collecting user data and selling it to advertisers - It’s incredibly naive to think that they don’t have a vested interest in identifying organic engagement


> You think in Reddit’s 20 year history no one has thought of indexing comments for data science workloads?

I’m sure they have, but an index doesn’t have anything to do with the python library you mentioned.

> Analytics workflows are never run on the production database, always on read replicas

Sure, either that or aggregating live streams of data, but either way it doesn’t have anything to do with ElasticSearch.

It’s still totally possible to sync things to ElasticSearch in a way that won’t affect performance on the production servers, but I’m just saying it’s not entirely trivial, especially at the scale reddit operates at, and there’s a cost for those extra servers and storage to consider as well.

It’s hard for us to say if that math works out.

> It’s incredibly naive to think that they don’t have a vested interest in identifying organic engagement

You would think, but you could say the same about Facebook and I know from experience that they don’t give a fuck about bots. If anything they actually like the bots because it looks like they have more users.


I figure it’s their absolute last priority. They might know rough bot #s, but haven’t built or don’t widely use takedown tools. There’s always an enhancement to deliver, and bots help their engagement metrics.


LMAO while AIs reading training data sets get stuck in infinite loops.

kubica avatar

Basically replaying a thread to make it look like there's activity in the sub.


It’s account farming. They make fake accounts look legitimate so they can use them to influence opinions on the site.

livus avatar

They also use them in groups of 3 to lure people to malicious sites and scam sites. Especially fake merchandise sites.


The right one is the “real” accounts. Notice how the left one is newer and all the accounts have names ending with four digits, except where they aren’t copies from the right.


The list of names at the left creeps me the fuck out.


I saw this exact same style of bot account years ago on Tumblr. They always follow the same naming scheme: one word or two words combined and then a string of 4 digits. I bet if you go to any of their profiles, you’ll find like 4 comments that are all copied from old threads and a bunch of upvotes on completely random subs, possibly even all of them being on other bot accounts’ posts and comments.
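
As a quick illustration of that naming scheme, a regular expression along these lines would match "one or two words, then four digits". The exact pattern and the `looks_generated` name are guesses for illustration, and on its own this would also flag plenty of real people who kept an auto-suggested name, so at best it is one weak signal among several:

```python
import re

# One or two alphabetic words, optionally joined by "_" or "-",
# followed by exactly four digits (e.g. Adjective_Noun_1234).
GENERATED_NAME = re.compile(r"[A-Za-z]+(?:[_-]?[A-Za-z]+)?[_-]?\d{4}")


def looks_generated(username: str) -> bool:
    """True if the whole username fits the words-plus-four-digits shape."""
    return GENERATED_NAME.fullmatch(username) is not None


for name in ["Adjective_Noun_1234", "HappyPanda4821", "livus", "user12345"]:
    print(name, looks_generated(name))
```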

The real question is whether they’re being used to fake activity on Reddit, sway public opinion by posting this sort of political slant, or will they later be used to advertise scams and this is just to make them seem legitimate.


I thought the names followed that format because that’s the format reddit used for suggestions when signing up.

I think the accounts are kind of “warmed up” this way to make them harder for reddit to identify as bots when they’re used for vote manipulation.

Like a bot that just voted in /r/politics threads would be easier to identify than one which comments here and there and gets a few upvotes itself.


Why not all of the above? If you have a service, you want to sell it to as many customers as possible.


Very good point.


No, the left one is older and most of the names on the right contain four numbers.

What’s going on here?

Maybe op updated the picture?


yeah they did for some reason it seems

@Blaze@reddthat.com avatar

I did, because other people complained in another comment that it was confusing to not have the older thread on the left.

Anyway, it’s pretty obvious which one is which one


Thanks, I almost thought I was delusional


I also thought you were, lmao.



  • Anti_Face_Weapon,

    The left predates the right by 10 months


    Give them some credit. They've finally changed the user name generator to random words instead of Adjective_Noun_####.

    AlteredStateBlob avatar

    Adjective_Noun_#### are default generated by reddit, so they upgraded to their own generator at least it seems.


    They have not, left is the more recent post. The right one could be real and is just recreated by these bots.


    I agree, credit retracted.


    No, I think those comments are just unwitting humans walking into the simulation.


    “It doesn’t look like anything to me.”



    I didn’t believe this when I first heard about it but it’s looking more true every day


    Reading the Wikipedia it seems quite unlikely, but then again maybe it’s also written by a bot.


    As a human I think the Wikipedia article is correct. I’m not a bot (drinking water right now- bots cannot do this).


    I saw a movie where bots had a kind of food & drink bag inside their belly to correct whatever they put in their mouth so they could emulate biologicals.


    Yeah, even if we’re not quite “there” yet, it feels like we’re at least moving in that direction


    Definitely depends on where you’re going. Certain Hexbear posts are such obvious bot networks, while some niche communities can remember what they wrote more than two comments ago.


    I have a more realistic description of “Dead Internet Theory” that involves no conspiracy theories:

    The Internet is becoming a monoculture, which is killing the vibrant, diverse, resilient, innovative space it used to be. Manifestos about a better way of life, and creative personal websites have been replaced with vapid social status posts in bland bootstrap layouts that double as data collection schemes. Technology that empowers people has been replaced with technology to restrict people. Bots masquerading as people is just the cherry on the sundae, the inevitable outcome of having created such a monoculture, a place where large orchards of content are so easy to pollute. The modern Internet ducking sucks, it has been ruined by people.


    This gets posted all the time, and it’s frustrating that it lacks any nuance.

    It’s just a spooky bedtime story… “imagine if everyone you talk to online is just a bot”

    Yes a lot of online content is generated.

    Yes it’s getting worse.

    Yes there’s lots of bots.

    However… you can choose where you spend your time online, and spend it with friends or likeminded people.

    What I mean to say is, some communities on reddit are “mostly dead”, but you don’t have to go there.


    Dead internet.

    @iso@lemy.lol avatar

    Interesting 🤔 Can you prove that you’re a human?


    Can any of us?


    I can’t get the captchas with the motorcycles, ever. I thought i was human but captcha dont lie


    Captchas are like hips that way.


    They lost so many users they needed the "engagement" numbers for the IPO so they opened the flood gate. Now they are stuck with an issue they can't fix without admitting the fraud.

    octopus_ink, (edited )

    How far does it have to go before investors start to care I wonder? I somehow doubt OP is the only person capable of perceiving and documenting this.


    Whereas if it is shifting to a front for gov. psy-ops just like Xitter, investors don't matter.


    I saw it a lot but didn’t know it was the comments too.

    Should be illegal, this website needs to disappear once and for all.

    I hope someone will create an extension to flag them

    @MiguelX413@pleroma.miguelcr.me avatar

    > this website needs to disappear once and for all

    Based. Reddit dēlendus est.


    I remember when the narwhal used to bacon only at midnight.

    Now the narwhal is forced to bacon continuously.

    This kills the narwhal.




    The pigs fly at midnight, but believe not what they say, for they tell only treacherous lies.


    The narwhal bacons your data at midnight.

    @Gradually_Adjusting@lemmy.world avatar

    Narwhallaire bacongrind moment




    has someone reached out to u/A-Seashell to let them know they’ll never get an answer :’(

    Dark_Arc, (edited )
    @Dark_Arc@social.packetloss.gg avatar

    I’m mildly annoyed the recent thread is on the left not the right, but this is super interesting so thanks for sharing! 🤖

    @Blaze@reddthat.com avatar

    Feel free to edit the image to change the order, I would update the post with the updated version!

    @Blaze@reddthat.com avatar

    Thank you, updated on LW, should federate to other instances as well! lemmy.world/post/14859950

    someguy3, (edited )

    How the fuck is this even possible?

    *Does this have to be done by Reddit itself? That’s the only way I can think of, but I really have no idea how it works.


    Historically bot posts are from people looking to sell accounts


    The only possible benefit to this kind of behavior is creating the impression that there’s more traffic on Reddit than there really is, from which only Reddit benefits.


    Especially around IPO time


    There’s plenty of other reasons to do this. From scammers trying to legitimize accounts to use later to groups trying to sway user opinions on the site. This sort of thing has been going on on plenty of other websites for years. This is the same strategy the porn bots on Tumblr used, and they were so prolific there that they got the Tumblr app removed from Apple’s app store.

    @UnderpantsWeevil@lemmy.world avatar

    Ah yes, it’s my old friend

    ♖👣 卩ย𝐒𝓼y I𝐍 ⓑίⓞ 🍩👍


    This doesn’t have to be done by Reddit.

    I can create as many accounts as I want, scrape old threads, and then replay the old conversations with my new accounts.

    Makes it appear very much like the accounts are real people. Then I can sell them to troll farms.

    @iterable@sh.itjust.works avatar

    It’s not just Reddit; I see this on every website I go to now, even on official game forums like World of Warcraft’s. It’s used to promote content or advertise in a way that tries to look organic.


    My favorite are the YouTube comments saying to follow Jesus or whatever regardless of the actual content of the video. Who is that even for? LOL


    My favorite is the comment I see on 80% of videos: “Upvote if you came here from Tik Tok”


    Clearly, the algorithm thinks you need Jesus.

    @irreticent@lemmy.world avatar

    It has seen your search history and is worried for your soul.

    @iterable@sh.itjust.works avatar

    Most likely those “Mega” Churches. If you post proof or call it out, watch yourself get spam reported. I have gotten reported and temp banned when the bots abuse the automated systems. I know a few devs and they are scared that they can’t keep ahead of trying to ID and remove AI like this.

    @UnderpantsWeevil@lemmy.world avatar

    Have you watched any sporting events recently? Some Christian group is willing to pay millions of dollars for a 30 second “Look at this puppy. Pretty great, right? Jesus. He loves puppies, too” ad spot.

    I have to assume that we’re just dealing with people who have way more money than sense, and this is literally the best they can come up with in terms of evangelism.


    My mechanical keyboard people haven’t really migrated over to Lemmy, so after I stopped posting to Reddit (I still lurk… sue me) I signed onto a couple of legacy forums. A few months ago, one forum had a poster ask about a sketchy email he got from a vendor asking him to mention their keyboard X number of times, and it didn’t even have to be uniformly positive, as long as he didn’t completely shit on them. They needed the visibility. He seemed iffy and I think decided against it, not least because the payment was, IIRC, a free keyboard.

    Not two days later, a veteran poster on the other forum magically mentions this obscure and unremarkable vendor, and while they’re qualified in their praise, they sure spent a lot of time talking about them. I was about to call it out, but then I just thought, “well hell, at least the company’s still using real people as shills. This is life now.”

