I guess this is turning into a real project so putting this out there if anyones... - All Things RSS - Great Feeds, Readers, and more

jonny, 5 months ago

I guess this is turning into a real project so putting this out there if anyone's interested:

We're making a very lightweight tool to create RSS feeds for journals from Crossref metadata (with room for other sources). If ya don't know, many publishers are shutting down their RSS feeds to drive people onto their surveillance platforms, and every enshittification leaves behind an opening for adversarial interop.
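A minimal sketch of the core idea, not the project's actual code: it assumes work records shaped like Crossref's JSON (`title` as a list, `DOI`, and `created.date-time`) and renders them as RSS 2.0 with the standard library. The function name `works_to_rss` is made up for illustration.

```python
from datetime import datetime
from email.utils import format_datetime
import xml.etree.ElementTree as ET


def works_to_rss(feed_title, feed_link, works):
    """Render Crossref-style work records as an RSS 2.0 document (string)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = f"Recent papers: {feed_title}"

    for work in works:
        item = ET.SubElement(channel, "item")
        # Crossref returns `title` as a list of strings; it can be missing or empty.
        titles = work.get("title") or ["(untitled)"]
        ET.SubElement(item, "title").text = titles[0]
        doi = work["DOI"]
        ET.SubElement(item, "link").text = f"https://doi.org/{doi}"
        ET.SubElement(item, "guid", isPermaLink="false").text = doi
        created = work.get("created", {}).get("date-time")
        if created:
            # RSS expects RFC 822 dates; the input is ISO 8601 with a trailing Z.
            when = datetime.fromisoformat(created.replace("Z", "+00:00"))
            ET.SubElement(item, "pubDate").text = format_datetime(when)
    return ET.tostring(rss, encoding="unicode")
```

Anything that can produce records in this shape (Crossref or another source) could be passed in, which is roughly the "room for other sources" part.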

This opens some interesting possibilities like creating feeds for keywords indexed across journals to start breaking down journals as the major organizational scheme of scholarly lit - papers have metadata keywords, but they mostly aren't used, so let's use them!

Eventually we'd like to write a FastAPI plugin similar to activitypub-express so we can make all feeds available on the fedi as well, and that would be a really nice set of tools to build for smaller AP projects that don't necessarily want to be full instances.

This is designed to be extremely deployable so you can run your own feed generator, but we'll also host a reference instance here at feeds.neuromatch.social once we get it running.

Just getting started, help wanted and welcome from anyone who loves #RSS and reading papers ♥

Repo: https://github.com/sneakers-the-rat/journal-rss

Cc @lili and @roaldarboel

Stems from this thread: https://neuromatch.social/@jonny/111668885237921256

matthewskelton, 5 months ago

@jonny "every enshittification leaves behind an opening for adversarial interop" 💥

@lili @roaldarboel

markwilliams, 5 months ago

@jonny @lili @roaldarboel
This looks like a great project! #RSS and reading papers is something we've been exploring with @sciety for preprints.

Lists each have a feed, so you could follow every reviewed preprint in Neuroscience by eLife https://labs.sciety.org/lists/by-id/3253c905-8083-4f3d-9e1f-0a8085e64ee5

We also have some curated feeds eg https://labs.sciety.org/feeds/by-name/neuroscience

...and you can create an RSS feed for (evaluated) preprints from a search term on our labs site https://labs.sciety.org/search?evaluated_only=true so you can get more nuanced results.

jonny, 5 months ago

@markwilliams @sciety
oo taking a look at your providers here and this is definitely interesting. thx for sharing :)

jonny, 5 months ago

@markwilliams @sciety yep this rocks, a lot of this is on my todo list to figure out how to do and it's really useful to have a model like this!!!!!

markwilliams, 5 months ago

@jonny @sciety Awesome. Code in https://github.com/sciety/sciety-labs. We'd love some more input from the research community into what's useful here. I don't expect we'd necessarily support journals with a traditional publishing model, but if there's a desire to implement more of the RSS functionality on our production application, sciety.org we certainly could.

jonny, 5 months ago

@markwilliams
@sciety
Basically we are going for "make feeds as a connection between metadata and their zotero client/whatever reader" and journals are just a starting point for indexing. Lemme give this a few more days to scope out and see how much of a side project vs jumping off point this thing is gonna be

petrichor, 5 months ago

@jonny
What a great idea! immediately clones the repo to poke about in it

For the RSS use case you might be interested in the joint Crossref/DataCite Event Data API:
https://www.crossref.org/services/event-data/
https://support.datacite.org/docs/eventdata-guide

jonny, 5 months ago

@petrichor
I have been looking to use that mysterious endpoint for some time :). You are very welcome to come play if you'd like, we're just getting to the feed metadata part, wanna get all the good controlled vocabulary terms in there, and then it's time to start adding more data sources!

mpe, 5 months ago

@jonny @lili @roaldarboel please do let us know if there are things you need in the API that aren't there. We can potentially prototype them in the Labs API, as @gbilder said...

lili, 5 months ago

@mpe @jonny @roaldarboel @gbilder

Thank you! I started looking into the feeds, and I found that some of the journals don't have up to date articles. For instance Physical Review D has no articles past 2015: https://api.crossref.org/journals/1550-7998/works

Do you know why that might be?

gbilder, 5 months ago

@lili @mpe @jonny @roaldarboel I am trying to get to the bottom of this, but at first glance it looks like they switched ISSNs in our system at some point:

https://api.crossref.org/journals?query=Physical%20Review%20D

jonny, 5 months ago

@gbilder
@lili @mpe @roaldarboel
Yes! And the new ISSN didn't get an entry, it looks like
https://neuromatch.social/@jonny/111697244953404613

mpe, 5 months ago

@lili @jonny @roaldarboel @gbilder not at my desk, but possibly a pagination matter - and that is only the first page?
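One way to rule out ordering or pagination as the cause is to ask for the newest records first. A minimal sketch, assuming the public Crossref REST API's `sort`, `order`, `rows`, and `mailto` query parameters; `journal_works_url` is a made-up helper name, not part of any library.

```python
from urllib.parse import quote, urlencode

CROSSREF_API = "https://api.crossref.org"


def journal_works_url(issn, rows=20, mailto=None):
    """Build a /journals/{issn}/works URL that lists the newest works first."""
    params = {"sort": "published", "order": "desc", "rows": rows}
    if mailto:
        # Identifying yourself is optional; shown here as an opt-in parameter.
        params["mailto"] = mailto
    return f"{CROSSREF_API}/journals/{quote(issn)}/works?{urlencode(params)}"


print(journal_works_url("1550-7998"))
```

If the newest record returned is still from 2015, the gap is in what is indexed under that ISSN rather than in which page was fetched.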

jonny, 5 months ago

@mpe @lili don't mean to drag you into a whole tech support convo on here, so feel free to tell us to buzz off or take it to a different venue, but something fun seems to be happening with the ISSNs :)

so eg this work from that journal has the ISSNs 2470-0010 and 2470-0029 which are not found, but the crossref query returns 1550-7998. Looks like they changed which ISSN they are submitting with their metadata because both sets of ISSNs are listed as valid

anyway we'll continue to puzzle about that, no hurry or pressure. it sounds sort of like a moment of metadata slapstick where they forgot to register their new ISSNs with y'all.
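A possible workaround while the ISSN question is open, sketched under assumptions: it queries the `/works` route with an `issn` filter for every ISSN the journal has used (assuming that filter matches these records), then de-duplicates by DOI. Both helper names are hypothetical.

```python
from urllib.parse import urlencode

# All ISSNs mentioned in this thread for Physical Review D.
PHYS_REV_D_ISSNS = ["1550-7998", "2470-0010", "2470-0029"]


def works_by_issn_url(issns, rows=20):
    """Build a /works URL filtered on several ISSNs at once."""
    filters = ",".join(f"issn:{issn}" for issn in issns)
    query = urlencode({"filter": filters, "rows": rows}, safe=":,")
    return f"https://api.crossref.org/works?{query}"


def merge_works_by_doi(*batches):
    """Combine lists of work records, keeping the first copy of each DOI."""
    seen = set()
    merged = []
    for batch in batches:
        for work in batch:
            doi = work["DOI"].lower()  # DOIs are case-insensitive
            if doi not in seen:
                seen.add(doi)
                merged.append(work)
    return merged
```

`merge_works_by_doi` is only needed if the ISSNs are fetched in separate requests, where the same paper can show up more than once.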

zackbatist, 5 months ago

@jonny @lili @roaldarboel Not sure if you're already aware (apologies if you are) but internet archive scholar (https://scholar.archive.org/) provides RSS feeds for keyword search results, which are based on the fatcat (https://fatcat.wiki/) dataset, which also feeds into the @OpenAlex project (https://help.openalex.org/). Your goals may be different from theirs, but maybe this will be helpful?

bnewbold, 5 months ago

@zackbatist @jonny came here to plug this! IA scholar is FastAPI for the search API to boot: https://scholar.archive.org/api/redoc

would :+1: to building on a catalog like @OpenAlex (or fatcat) which aggregates from a bunch of sources in addition to Crossref; that aggregation/normalization is a bunch of work over multiple protocols/standards, and it is great to de-dupe the interop work

jonny, 5 months ago

@bnewbold
@zackbatist @OpenAlex
Ok thanks for showing that to me!!! That rocks and we'll definitely integrate w/that and add data sources. Crossref is just a good starting place but big fan of openalex

CCochard, 5 months ago

@jonny @lili @roaldarboel is this what you were after? @elduvelle

jonny, 5 months ago

@CCochard
@elduvelle was indeed the inspiration :)

CCochard, 5 months ago

@jonny @elduvelle I don't know if that makes me very perceptive or completely oblivious...

jonny, 5 months ago

@CCochard
@elduvelle
Definitely the perceptive one :)

elduvelle, 5 months ago

haha yes! Thank you for thinking of me @CCochard !
And HUGE thank you @jonny for doing this! 🥹

jonny, 5 months ago

@elduvelle
Thank us when it works 😉😘, ill make sure ur in the credits

elduvelle, 5 months ago

@jonny hahaha
“Contribution: asked for it, like a big baby”

gbilder, 5 months ago

Hi. We've been considering doing something with RSS in Crossref labs. See https://crossref.atlassian.net/browse/RD-11

Note one issue you will face is that many smaller journals (~17k) don't have ISSNs and are currently invisible via the current /journals/ route. We are looking to publicly expose an internal container-id that will allow users to get at these and avoid the nonsense of juggling P-ISSNs and E-ISSNs. You can see a prototype in our labs API: https://api.labs.crossref.org/journals/13907662?mailto=labs@crossref.org

Would love to talk.

jonny, 5 months ago

@gbilder Somehow i missed your comment, sorry!!! Opened an issue here: https://github.com/sneakers-the-rat/paper-feeds/issues/28

I had just sort of assumed any journal would have an ISSN, but i also had a dark foreboding that any assumption i made about scholarly metadata would be proven flagrantly incorrect. Part of what i'm interested in experimenting here is harmonizing multiple metadata sources in a way that can preserve heterogeneity while being usable, a famously bad idea, so I am curious about what the landscape of alternatives looks like here.

An internal key seems like a good idea for crossref, but i'm also curious what these journals do use as an identifier, PID or not? is the URL the most computer-friendly identifier there? I wonder what the etiquette about registering something like an ARK in someone else's name is. I have been puzzling about this too as i try and make data structures for p2p archiving - I want to make a public archive of someone else's publicly available thing, and I want to attribute it to them in a discoverable way, make it clear that it does not come from them directly, but provide a means of verifying that it is in fact a copy of what they have (thinking about explicitly public things in highly-available archives here, future stages will be more consent-oriented). I am also sort of curious what that looks like to y'all - when they submit metadata, what is the identity they use? I assume some kind of member ID?

anyway i have lots of questions but my eyes are starting to glaze over from computer for the night

cameronneylon, 5 months ago

@jonny @lili @roaldarboel Hi! This is great. I'm wondering whether other filters would be in scope or not (I'm thinking specifically of filtering on organisations/institutions rather than journals). Appreciate you might want to stay focussed on the specific use case of course!

jonny, 5 months ago

@cameronneylon
Generalizing the feed generation system to use arbitrary metadata keys is increasingly exactly what I have in mind with this project as a v2 or v3
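As a rough illustration of feeds keyed on arbitrary metadata, here is a sketch that indexes records by any field. The key names (`keywords`, `institution`) and the record shape are assumptions for the example; real sources name and structure these fields differently.

```python
from collections import defaultdict


def index_by_key(works, key):
    """Map each value found under `key` to the list of works that carry it."""
    index = defaultdict(list)
    for work in works:
        values = work.get(key, [])
        if isinstance(values, str):
            # Allow a single string as well as a list of strings.
            values = [values]
        for value in values:
            # Normalise case and whitespace so "Hippocampus " matches "hippocampus".
            index[value.strip().lower()].append(work)
    return dict(index)


works = [
    {"DOI": "10.1/a", "journal": "J One", "keywords": ["Hippocampus", "place cells"]},
    {"DOI": "10.1/b", "journal": "J Two", "keywords": ["hippocampus "]},
    {"DOI": "10.1/c", "journal": "J Two", "institution": "Yale"},
]
by_keyword = index_by_key(works, "keywords")
```

Each entry in the resulting index is a candidate feed that cuts across journals, and swapping the key gives a feed per institution, funder, or anything else the metadata carries.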

kdnyhan, 5 months ago

@jonny @cameronneylon
Re filters - I imagine you are familiar with PubMed RSS feeds? Very handy - one can turn any search into an RSS feed.

Eg for the query:

("Int J Evid Based Healthc"[Journal] OR "JBI Evid Implement"[Journal] OR "JBI Evid Synth"[Journal] OR "JBI Database System Rev Implement Rep"[Journal] OR "JBI Libr Syst Rev"[Journal]) AND (Yale[ad])

➡️

https://pubmed.ncbi.nlm.nih.gov/rss/search/1hY_wykJMhuzRuLgS8KS1PhYrZTZQNz6qlgjrXgeFR7hT1NMOW/?limit=15&utm_campaign=pubmed-2&fc=20240105095936

jonny, 5 months ago

@kdnyhan
@cameronneylon
I think query feeds like this are sort of out of scope for this project exactly bc they already exist lol, I think we're doing defined metadata feeds with control over data sourcing, but hey if this is the eventual form then im into it- most things I do end up developing some microsyntax like this eventually

zagone, 5 months ago

This is lovely. On another account I run several bots for bringing RSS feeds to Toots using IFTTT and Zapier.

Even publishers with RSS feeds block so many robots!

@jonny @lili @roaldarboel

jonny, 5 months ago

@zagone I respect anti-scraping measures most of the time, but with publishers I will in fact break out the tor webdriver

moritz_negwer, 5 months ago

@jonny @lili @roaldarboel Fantastic, thanks for making this a thing!

In related news, my Mastodon feed just brought up a RSS-to-fedi-timeline tool, maybe worth adapting? https://genart.social/@twilliability/111688775264827884

Anyways, so cool to see adversarial interoperability in action! I'll follow this with interest :)

jonny, 5 months ago

@moritz_negwer added to an issue tracking related projects!!!!! https://github.com/sneakers-the-rat/journal-rss/issues/8 thank you <3. stealing from the rich is always fun and never boring

jonny, 5 months ago

i gotta go AFK for a bit but yeehaw we have MVP. later we refine the all-crappy everything

Very plain page - search for a journal, populate a list of popular journals with buttons for "feeds" next to them. One is an RSS feed button and blam it gives you an RSS feed!

jonny, 5 months ago

I cannot stress enough how much htmx rocks

jonny, 5 months ago

Like the last time I had to do "full stack" something something deployment had to have a whole ass separate set of scripts across like 10 frameworks, but being able to just not do a react frontend with a billion webpacked dependencies while still having reasonable interactivity is so NICE

jonny, 5 months ago

Turns out someone has already done a FastAPI ActivityPub plugin. https://docs.microblog.pub/
so that's happening tomorrow

julian, 5 months ago

@jonny Cool stuff! I don't personally care about FastAPI, but:

> Implements the ActivityPub server to server protocol
> Uses SQLite, and Python 3.10+
> GNU AGPL v3

This may help me speed up FediRoster development by a lot. Thanks for the link and cheers to @dev!

jonny, 5 months ago

@julian
@dev
I'm not personally invested in FastAPI either but as far as being able to collaborate with my peers (academic/scientific programmers) who mostly write python, use pydantic, but don't do flask, it's been very refreshing and fun to collab compared to ts + express or rails. I also am just sort of loving how @tiangolo designs things between it and sqlmodel, again refreshing. Do recommend :)

Also excited to check out @dev 's work more too - first I'm seeing it!

jonny, 5 months ago

@julian
Also fediroster looks cool and I gotta check out how you're doing the proof of identity there. I've tried to do that and bounced off it a few times bc it wasn't clear to me how to OAuth with arbitrary instances.

julian, 5 months ago

@jonny Thank you! 😀

Mastodon has an API that lets remote servers register new applications, which users can then sign into. It feels a bit strange and probably isn't exactly how OAuth was conceptualized, but it works. 🤷 I think @thisismissem would know more about the details than I do, she's definitely posted about it before.

Anyway, FediRoster offers account validation via OAuth or via DMing a one-off code to a Mastodon bot. But I want to revamp the process for 1.0: https://fietkau.social/@julian/111674347300497688
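For reference, the app-registration flow described above can be sketched roughly like this. It is not FediRoster's code: it assumes a Mastodon-compatible server exposing `POST /api/v1/apps` and `/oauth/authorize`, and the helper names are invented for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def app_registration_request(instance, client_name, redirect_uri, scopes="read"):
    """Build (but do not send) the request that registers an app on `instance`."""
    body = urlencode(
        {"client_name": client_name, "redirect_uris": redirect_uri, "scopes": scopes}
    ).encode()
    return Request(f"https://{instance}/api/v1/apps", data=body, method="POST")


def register_app(instance, client_name, redirect_uri, scopes="read"):
    """Send the registration; the JSON reply carries client_id and client_secret."""
    request = app_registration_request(instance, client_name, redirect_uri, scopes)
    with urlopen(request) as response:
        return json.load(response)


def authorize_url(instance, client_id, redirect_uri, scopes="read"):
    """URL to send the user to so they can sign in on their own instance."""
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": scopes,
        }
    )
    return f"https://{instance}/oauth/authorize?{query}"
```

Because registration happens per instance, an app that accepts arbitrary instances would need to register on first contact with each one and store the returned credentials keyed by hostname.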

thisismissem, 5 months ago

@julian @jonny I can't give any advice today, but maybe once I'm feeling well again I can.

jonny, 5 months ago

@thisismissem
@julian
Please rest ♥

NikaShilobod, 5 months ago

@jonny
@lili @roaldarboel

I LOVE THIS. It's the tool I've been looking for forever. Will have to take a poke around when I can, may be a bit though.

NikaShilobod, 5 months ago

@jonny
@lili @roaldarboel

PS I've collected some RSS meta lists already I think. I'll see if I can dig them out if you want them.

jonny, 5 months ago

@NikaShilobod thank you!!! specifically not trying to index existing RSS feeds (since they already exist and fill the need) but make a start towards actively using uncopyrightable metadata to break systems, so that we are not at the mercy of publishers for having or not having feeds <3. any metadata source is welcome tho, working to make that general after we get the basic framing out of the way

NikaShilobod, 5 months ago

@jonny Also, have you guys sorted some kind of diagram on how you want this program to be orchestrated? I am teaching myself a lot and appreciate the visuals to know where I am useful. I am not allowing myself to touch this until next month, though!

jonny, 5 months ago

@NikaShilobod
Just getting started literally in the last couple days! The diagram is hopefully p simple on this one since it's pretty much data source -> (interface stuff) -> feed

NikaShilobod, 5 months ago

@jonny Also, look at how these guys manage to pull feeds. There may be some source code you can translate: https://github.com/shevabam/get-rss-feed-url-extension

jonny, 5 months ago

@NikaShilobod
Maybe! The problem here is sort of exactly that there aren't RSS feeds sometimes, but the commercial circumstances of the publishing industry still require them to publish non-copyrightable metadata on platforms largely maintained by librarians and library-adjacent orgs, so we're exploiting an opening. But if there is generalized scraping code anywhere I'm always game to read it

irenes, 5 months ago

@jonny @lili @roaldarboel oh hey! we're excited to hear about this, it's important stuff!

jonny, 5 months ago

@irenes @lili @roaldarboel sometimes you spend 4 years working on something and nobody cares, sometimes you spend 4 hours on something and everybody cares <3

roaldarboel, 5 months ago

@jonny @irenes @lili How I felt with my LaTeX template 😂

jonny, 5 months ago

@roaldarboel
Plz all TeX work is sacred ♥

jonny, 5 months ago

@roaldarboel (and i love your template)

neuralreckoning, 5 months ago

@jonny @lili @roaldarboel I'd love to have the time and energy to help with this but I have to be realistic. 😞 I will try to comment on those issues you mentioned to me though!

jonny, 5 months ago

@neuralreckoning
@lili @roaldarboel
Dw we got this one. Repo will be there if you catch some inspo

citc, 5 months ago

Do the journals have to be open access?

jonny, 5 months ago

@citc nope ;)
