vantablack, 2 months ago to random
just discovered what miiiiiight be some sorta new fedi scraper indexer thingy?
definitely don't take my word for it but, just throwing this out there so someone more technically-inclined can take a look
https://fediscanner.info/
vantablack, 2 months ago
#scraper #indexer #maybe
vantablack, 2 months ago
⚠️ FEDI SCRAPER AND INDEXER ⚠️
okay according to multiple peeps in the replies of the original post, this is indeed in fact a fedi scraper and indexer i found
https://fediscanner.info
#fediblock #scraper #indexer
Taffer, 3 months ago to llm
I was going to ask if there’s some robots.txt magic that’ll keep LLM scrapers out.
Then I thought of a better idea.
Is there a source of text/images that I can toss on there that’ll poison “AI” scrapers?
#genai #llm #scraper #robotstxt
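On the robots.txt half of the question, here is a minimal sketch of what a per-crawler block could look like and how to check it with Python's standard `urllib.robotparser`. The two user-agent tokens are ones published by their crawlers, but which bots to list is an assumption, and the `is_allowed` helper and example URL are hypothetical. robots.txt is advisory: it only keeps out crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block two self-identified crawlers, allow everyone else.
# Which user agents to list is an assumption; adjust to the bots you care about.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "CCBot", "Mozilla/5.0"):
        print(agent, is_allowed(agent, "https://example.com/post/1"))
```

A scraper that ignores robots.txt is unaffected by any of this, so it works as a polite request rather than a lock.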
benlk, 7 months ago to random
I have an idea for a 1000+ record scraper project, but I'm not sure how to capture the data. CSV or a database? Which database? Any suggestions?
benlk, 7 months ago
If I end up putting this #scraper project online, then it's probably best to use MySQL, since that's guaranteed to be available on website hosts, but is there a better recommendation?
@simon @palewire
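One possible answer to the CSV-or-database question, sketched under assumptions: SQLite ships with Python's standard library, stores everything in a single file, and can be dumped to CSV whenever a flat file is needed. This is an alternative to the MySQL option mentioned above, not a claim about what the project ended up using. The table name, columns, and function names below are hypothetical.

```python
import csv
import sqlite3

# Hypothetical schema: one row per scraped page, keyed by URL so re-runs
# overwrite earlier rows instead of duplicating them.
SCHEMA = """
CREATE TABLE IF NOT EXISTS records (
    url        TEXT PRIMARY KEY,
    title      TEXT,
    scraped_at TEXT
)
"""


def save_records(db_path, records):
    """Upsert (url, title, scraped_at) tuples and return the total row count."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(SCHEMA)
            conn.executemany(
                "INSERT OR REPLACE INTO records (url, title, scraped_at) "
                "VALUES (?, ?, ?)",
                records,
            )
        return conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
    finally:
        conn.close()


def export_csv(db_path, csv_path):
    """Write every stored record to a CSV file with a header row."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT url, title, scraped_at FROM records ORDER BY url"
        )
        with open(csv_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["url", "title", "scraped_at"])
            writer.writerows(rows)
    finally:
        conn.close()
```

At around a thousand records either format would cope. The main thing the database buys here is the primary key, which makes repeated scraper runs safe to re-run.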