jonny,
@jonny@neuromatch.social avatar

I guess this is turning into a real project so putting this out there if anyones interested:

We're making a very lightweight tool to create RSS feeds for journals from crossref metadata (with room for other sources). If ya dont know, many publishers are shutting down their RSS feeds to drive people onto their surveillance platforms, and every enshittification leaves behind an opening for adversarial interop.

This opens some interesting possibilities like creating feeds for keywords indexed across journals to start breaking down journals as the major organizational scheme of scholarly lit - papers have metadata keywords, but they mostly arent used, so lets use them!

Eventually wed like to write a FastAPI plugin similar to activitypub-express so we can make all feeds available on the fedi as well, and that would be a really nice set of tools to build for smaller AP projects that dont necessarily want to be full instances.

This is designed to be extremely deployable so you can run your own feed generator, but we'll also host a reference instance here at feeds.neuromatch.social once we get it running.

Just getting started, help wanted and welcome from anyone who loves RSS and reading papers ♥

Repo: https://github.com/sneakers-the-rat/journal-rss

Cc @lili and @roaldarboel

Stems from this thread: https://neuromatch.social/@jonny/111668885237921256
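For anyone wondering what "RSS from Crossref metadata" could look like in practice, here is a minimal sketch. It is not the project's actual code. It assumes work records shaped like the Crossref REST API returns them (`DOI`, `title` as a list, `published` with `date-parts`); the function names `works_to_rss` and `date_from_parts` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def date_from_parts(parts):
    """Crossref dates look like [[year, month, day]]; month and day may be missing."""
    year, month, day = (list(parts[0]) + [1, 1])[:3]
    return datetime(year, month, day, tzinfo=timezone.utc)


def works_to_rss(journal_title, journal_url, works):
    """Render a list of Crossref-style work dicts as an RSS 2.0 string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = journal_title
    ET.SubElement(channel, "link").text = journal_url
    ET.SubElement(channel, "description").text = f"Recent works in {journal_title}"

    for work in works:
        item = ET.SubElement(channel, "item")
        # Crossref stores titles as a list; fall back if it is absent or empty.
        ET.SubElement(item, "title").text = (work.get("title") or ["(untitled)"])[0]
        ET.SubElement(item, "link").text = f"https://doi.org/{work['DOI']}"
        ET.SubElement(item, "guid", isPermaLink="false").text = work["DOI"]
        published = work.get("published", {}).get("date-parts")
        if published:
            # RSS expects RFC 822 style dates, which format_datetime produces.
            ET.SubElement(item, "pubDate").text = format_datetime(date_from_parts(published))

    return ET.tostring(rss, encoding="unicode")
```

Feeding it the `items` list from a `/journals/{issn}/works` response should be enough to get something a feed reader can subscribe to; richer metadata (authors, abstracts, keywords) would go into extra elements per item.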

matthewskelton,
@matthewskelton@mastodon.social avatar

@jonny "every enshittification leaves behind an opening for adversarial interop" 💥

@lili @roaldarboel

markwilliams,

@jonny @lili @roaldarboel
This looks like a great project! RSS and reading papers is something we've been exploring with @sciety for preprints.

Lists each have a feed, so you could follow every reviewed preprint in Neuroscience by eLife https://labs.sciety.org/lists/by-id/3253c905-8083-4f3d-9e1f-0a8085e64ee5

We also have some curated feeds eg https://labs.sciety.org/feeds/by-name/neuroscience

...and you can create an RSS feed for (evaluated) preprints from a search term on our labs site https://labs.sciety.org/search?evaluated_only=true so you can get more nuanced results.

jonny,
@jonny@neuromatch.social avatar

@markwilliams @sciety
oo taking a look at your providers here and this is definitely interesting. thx for sharing :)

jonny,
@jonny@neuromatch.social avatar

@markwilliams @sciety yep this rocks, a lot of this is on my todo list to figure out how to do and really useful to have a model like this!!!!!

markwilliams,

@jonny @sciety Awesome. Code in https://github.com/sciety/sciety-labs. We'd love some more input from the research community into what's useful here. I don't expect we'd necessarily support journals with a traditional publishing model, but if there's a desire to implement more of the RSS functionality on our production application, sciety.org we certainly could.

jonny,
@jonny@neuromatch.social avatar

@markwilliams
@sciety
Basically we are going for "make feeds as a connection between metadata and their zotero client/whatever reader" and journals are just a starting point for indexing. Lemme give this a few more days to scope out and see how much of a side project vs jumping off point this thing is gonna be

petrichor,
@petrichor@digipres.club avatar

@jonny
What a great idea! immediately clones the repo to poke about in it

For the RSS use case you might be interested in the joint Crossref/DataCite Event Data API:
https://www.crossref.org/services/event-data/
https://support.datacite.org/docs/eventdata-guide

jonny,
@jonny@neuromatch.social avatar

@petrichor
I have been looking to use that mysterious endpoint for some time :). You are very welcome to come play if youd like, we're just getting to the feed metadata part, wanna get all the good controlled vocabulary terms in there, and then its time to start adding more data sources!

mpe,
@mpe@ravenation.club avatar

@jonny @lili @roaldarboel please do let us know if there are things you need in the API that aren't there. We can potentially prototype them in the Labs API, as @gbilder said...

lili,

@mpe @jonny @roaldarboel @gbilder

Thank you! I started looking into the feeds, and I found that some of the journals don't have up to date articles. For instance Physical Review D has no articles past 2015: https://api.crossref.org/journals/1550-7998/works

Do you know why that might be?

gbilder,

@lili @mpe @jonny @roaldarboel I am trying to get to the bottom of this, but at first glance it looks like they switched ISSNs in our system at some point:

https://api.crossref.org/journals?query=Physical%20Review%20D

jonny,
@jonny@neuromatch.social avatar

@gbilder
@lili @mpe @roaldarboel
Yes! And the new ISSN didnt get an entry it looks like
https://neuromatch.social/@jonny/111697244953404613

mpe,
@mpe@ravenation.club avatar

@lili @jonny @roaldarboel @gbilder not at my desk, but possibly a pagination matter - and that is only the first page?
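To rule out the pagination explanation, one could walk every page of results instead of only the first. The sketch below assumes Crossref's cursor-based deep paging (`cursor=*` on the first request, then the `next-cursor` value from each response) is available on the journal works route. `iter_works` and `http_get_json` are illustrative names, and the fetch function is passed in so the loop can be tried without touching the network.

```python
import json
import urllib.request
from urllib.parse import urlencode


def http_get_json(url):
    """Plain stdlib fetch; swap in any HTTP client that returns parsed JSON."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def iter_works(fetch_json, issn, rows=100, max_pages=50):
    """Yield every work for a journal, following Crossref cursors page by page."""
    cursor = "*"  # "*" asks the API to start a new cursor
    for _ in range(max_pages):  # hard cap so a bad cursor cannot loop forever
        query = urlencode({"rows": rows, "cursor": cursor})
        url = f"https://api.crossref.org/journals/{issn}/works?{query}"
        message = fetch_json(url)["message"]
        items = message.get("items", [])
        if not items:
            return  # an empty page means the result set is exhausted
        yield from items
        cursor = message.get("next-cursor")
        if not cursor:
            return
```

Calling `iter_works(http_get_json, "1550-7998")` and checking the newest publication year among the results would show whether the later articles are really missing or just on later pages.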

jonny,
@jonny@neuromatch.social avatar

@mpe @lili don't mean to drag you into a whole tech support convo on here, so feel free to tell us to buzz off or take it to a different venue, but something fun seems to be happening with the ISSNs :)

so eg this work from that journal has the ISSNs 2470-0010 and 2470-0029 which are not found, but the crossref query returns 1550-7998. Looks like they changed which ISSN they are submitting with their metadata because both sets of ISSNs are listed as valid

anyway we'll continue to puzzle about that, no hurry or pressure. it sounds sort of like a moment of metadata slapstick where they forgot to register their new ISSNs with y'all.
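One possible workaround for the ISSN mismatch, sketched under assumptions: try each ISSN listed on the work record against the journals route, and if none resolve, fall back to a title search. The names `candidate_journal_urls` and `find_journal` are hypothetical, the injected `fetch_json` is assumed to return `None` for a 404, and taking the first search hit is a naive choice that a real tool would want to verify.

```python
from urllib.parse import quote, urlencode

API = "https://api.crossref.org"


def candidate_journal_urls(work):
    """List lookups to try: one per ISSN on the work, then a title search."""
    urls = [f"{API}/journals/{quote(issn)}" for issn in work.get("ISSN", [])]
    titles = work.get("container-title") or []
    if titles:
        urls.append(f"{API}/journals?" + urlencode({"query": titles[0]}))
    return urls


def find_journal(work, fetch_json):
    """Return the first journal record any candidate lookup yields, else None.

    fetch_json(url) is assumed to return parsed JSON, or None when not found.
    """
    for url in candidate_journal_urls(work):
        payload = fetch_json(url)
        if payload is None:
            continue
        message = payload["message"]
        # A search returns a list under "items"; a direct lookup is one record.
        journals = message.get("items", [message])
        if journals:
            return journals[0]
    return None
```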

zackbatist,

@jonny @lili @roaldarboel Not sure if you're already aware (apologies if you are) but internet archive scholar (https://scholar.archive.org/) provides RSS feeds for keyword search results, which are based on the fatcat (https://fatcat.wiki/) dataset, which also feeds into the @OpenAlex project (https://help.openalex.org/). Your goals may be different from theirs, but maybe this will be helpful?

bnewbold,

@zackbatist @jonny came here to plug this! IA scholar is FastAPI for the search API to boot: https://scholar.archive.org/api/redoc

would :+1: to building on a catalog like @OpenAlex (or fatcat) which aggregates from a bunch of sources in addition to Crossref; that aggregation/normalization is a bunch of work over multiple protocols/standards, and it is great to de-dupe the interop work

jonny,
@jonny@neuromatch.social avatar

@bnewbold
@zackbatist @OpenAlex
Ok thanks for showing that to me!!! That rocks and we'll definitely integrate w/that and add data sources. Crossref is just a good starting place but big fan of openalex

CCochard,
@CCochard@mastodon.social avatar

@jonny @lili @roaldarboel is this what you were after? @elduvelle

jonny,
@jonny@neuromatch.social avatar

@CCochard
@elduvelle was indeed the inspiration :)

CCochard,
@CCochard@mastodon.social avatar

@jonny @elduvelle I don't know if that makes me very perceptive or completely oblivious...

jonny,
@jonny@neuromatch.social avatar

@CCochard
@elduvelle
Definitely the perceptive one :)

elduvelle,
@elduvelle@neuromatch.social avatar

haha yes! Thank you for thinking of me @CCochard !
And HUGE thank you @jonny for doing this! 🥹

jonny,
@jonny@neuromatch.social avatar

@elduvelle
Thank us when it works 😉😘, ill make sure ur in the credits

elduvelle,
@elduvelle@neuromatch.social avatar

@jonny hahaha
“Contribution: asked for it, like a big baby”

gbilder,

Hi. We've been considering doing something with RSS in Crossref labs. See https://crossref.atlassian.net/browse/RD-11

Note one issue you will face is that many smaller journals (~17k) don't have ISSNs and are currently invisible via the current /journals/ route. We are looking to publicly expose an internal container-id that will allow users to get at these and avoid the nonsense of juggling P-ISSNs and E-ISSNs. You can see a prototype in our labs API: https://api.labs.crossref.org/journals/13907662?mailto=labs@crossref.org

Would love to talk.

jonny,
@jonny@neuromatch.social avatar

@gbilder Somehow i missed your comment, sorry!!! Opened an issue here: https://github.com/sneakers-the-rat/paper-feeds/issues/28

I had just sort of assumed any journal would have an ISSN, but i also had a dark foreboding that any assumption i made about scholarly metadata would be proven flagrantly incorrect. Part of what i'm interested in experimenting with here is harmonizing multiple metadata sources in a way that can preserve heterogeneity while being usable, a famously bad idea, so I am curious about what the landscape of alternatives looks like here.

An internal key seems like a good idea for crossref, but i'm also curious what these journals do use as an identifier, PID or not? is the URL the most computer-friendly identifier there? I wonder what the etiquette about registering something like an ARK in someone else's name is. I have been puzzling about this too as i try and make data structures for p2p archiving - I want to make a public archive of someone else's publicly available thing, and I want to attribute it to them in a discoverable way, make it clear that it does not come from them directly, but provide a means of verifying that it is in fact a copy of what they have (thinking about explicitly public things in highly-available archives here, future stages will be more consent-oriented). I am also sort of curious what that looks like to y'all - when they submit metadata, what is the identity they use? I assume some kind of member ID?

anyway i have lots of questions but my eyes are starting to glaze over from computer for the night

cameronneylon,

@jonny @lili @roaldarboel Hi! This is great. I'm wondering whether other filters would be in scope or not (I'm thinking specifically of filtering on organisations/institutions rather than journals). Appreciate you might want to stay focussed on the specific use case of course!

jonny,
@jonny@neuromatch.social avatar

@cameronneylon
Generalizing the feed generation system to use arbitrary metadata keys is increasingly exactly what I have in mind with this project as a v2 or v3

kdnyhan,
@kdnyhan@social.esmarconf.org avatar

@jonny @cameronneylon
Re filters - I imagine you are familiar with PubMed RSS feeds? Very handy - one can turn any search into an RSS feed.

Eg for the query:

("Int J Evid Based Healthc"[Journal] OR "JBI Evid Implement"[Journal] OR "JBI Evid Synth"[Journal] OR "JBI Database System Rev Implement Rep"[Journal] OR "JBI Libr Syst Rev"[Journal]) AND (Yale[ad])

➡️

https://pubmed.ncbi.nlm.nih.gov/rss/search/1hY_wykJMhuzRuLgS8KS1PhYrZTZQNz6qlgjrXgeFR7hT1NMOW/?limit=15&utm_campaign=pubmed-2&fc=20240105095936

jonny,
@jonny@neuromatch.social avatar

@kdnyhan
@cameronneylon
I think query feeds like this are sort of out of scope for this project exactly bc they already exist lol, I think we're doing defined metadata feeds with control over data sourcing, but hey if this is the eventual form then im into it- most things I do end up developing some microsyntax like this eventually

zagone,

This is lovely. On another account I run several bots for bringing RSS feeds to Toots using IFTTT and Zapier.

Even publishers with RSS feeds block so many robots!

@jonny @lili @roaldarboel

jonny,
@jonny@neuromatch.social avatar

@zagone I respect anti-scraping measures most of the time, but with publishers I will in fact break out the tor webdriver

moritz_negwer,
@moritz_negwer@mstdn.science avatar

@jonny @lili @roaldarboel Fantastic, thanks for making this a thing!

In related news, my Mastodon feed just brought up an RSS-to-fedi-timeline tool, maybe worth adapting? https://genart.social/@twilliability/111688775264827884

Anyways, so cool to see adversarial interoperability in action! I'll follow this with interest :)

jonny,
@jonny@neuromatch.social avatar

@moritz_negwer added to an issue tracking related projects!!!!! https://github.com/sneakers-the-rat/journal-rss/issues/8 thank you <3. stealing from the rich is always fun and never boring

jonny,
@jonny@neuromatch.social avatar

I cannot stress enough how much htmx rocks

jonny,
@jonny@neuromatch.social avatar

Like the last time I had to do "full stack" something something deployment had to have a whole ass separate set of scripts across like 10 frameworks, but being able to just not do a react frontend with a billion webpacked dependencies while still having reasonable interactivity is so NICE

jonny,
@jonny@neuromatch.social avatar

Turns out someone has already done a FastAPI ActivityPub plugin. https://docs.microblog.pub/
so that's happening tomorrow

julian,
@julian@fietkau.social avatar

@jonny Cool stuff! I don't personally care about FastAPI, but:

> Implements the ActivityPub server to server protocol
> Uses SQLite, and Python 3.10+
> GNU AGPL v3

This may help me speed up FediRoster development by a lot. Thanks for the link and cheers to @dev!

jonny,
@jonny@neuromatch.social avatar

@julian
@dev
Im also not personally invested in FastAPI either but as far as being able to collaborate with my peers (academic/scientific programmers) who mostly write python, use pydantic, but dont do flask, its been very refreshing and fun to collab compared to ts + express or rails. I also am just sort of loving how @tiangolo designs things between it and sqlmodel, again refreshing. Do recommend :)

Also excited to check out @dev 's work more too - first im seeing it!

jonny,
@jonny@neuromatch.social avatar

@julian
Also fediroster looks cool and I gotta check out how youre doing the proof of identity there. Ive tried to do that and bounced off it a few times bc it wasnt clear to me how to OAUTH with arbitrary instances.

julian,
@julian@fietkau.social avatar

@jonny Thank you! 😀

Mastodon has an API that lets remote servers register new applications, which users can then sign into. It feels a bit strange and probably isn't exactly how OAuth was conceptualized, but it works. 🤷 I think @thisismissem would know more about the details than I do, she's definitely posted about it before.
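As a rough illustration of that flow: a remote service first registers itself with the user's instance via `POST /api/v1/apps`, then sends the user to `/oauth/authorize` with the `client_id` it got back. The sketch below only builds the two requests and sends nothing; the helper names and the example hostnames are placeholders, and this is not necessarily how FediRoster does it.

```python
from urllib.parse import urlencode


def app_registration_request(instance, app_name, redirect_uri, scopes="read"):
    """Return (url, form_fields) for registering an app on one instance.

    POSTing these fields should return JSON that includes client_id and
    client_secret, which need to be stored per instance.
    """
    url = f"https://{instance}/api/v1/apps"
    form = {
        "client_name": app_name,
        "redirect_uris": redirect_uri,
        "scopes": scopes,
    }
    return url, form


def authorize_url(instance, client_id, redirect_uri, scopes="read"):
    """Build the URL the user visits to approve the app and get a code."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scopes,
    })
    return f"https://{instance}/oauth/authorize?{query}"
```

The code that comes back on the redirect is then exchanged at `/oauth/token` for an access token. Since every instance needs its own registration, a service talking to arbitrary instances ends up caching one `client_id`/`client_secret` pair per hostname.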

Anyway, FediRoster offers account validation via OAuth or via DMing a one-off code to a Mastodon bot. But I want to revamp the process for 1.0: https://fietkau.social/@julian/111674347300497688

thisismissem,
@thisismissem@hachyderm.io avatar

@julian @jonny I can't give any advice today, but maybe once I'm feeling well again I can.

jonny,
@jonny@neuromatch.social avatar

@thisismissem
@julian
Please rest ♥

NikaShilobod,

@jonny
@lili @roaldarboel

I LOVE THIS. It's the tool I've been looking for forever. Will have to take a poke around when I can, may be a bit though.

NikaShilobod,

@jonny
@lili @roaldarboel

PS I've collected some RSS meta lists already I think. I'll see if I can dig them out if you want them.

jonny,
@jonny@neuromatch.social avatar

@NikaShilobod thank you!!! specifically not trying to index existing RSS feeds (since they already exist and fill the need) but make a start towards actively using uncopyrightable metadata to break systems that make it so that we are not at the mercy of publishers for having or not having feeds <3. any metadata source is welcome tho, working to make that general after we get the basic framing out of the way

NikaShilobod,

@jonny Also, have you guys sorted some kind of diagram on how you want this program to be orchestrated? I am teaching myself a lot and appreciate the visuals to know where I am useful. I am not allowing myself to touch this until next month, though!

jonny,
@jonny@neuromatch.social avatar

@NikaShilobod
Just getting started literally in the last couple days! The diagram is hopefully p simple on this one since its pretty much data source -> (interface stuff) -> feed
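That "data source -> (interface stuff) -> feed" shape could be sketched roughly as below. Everything here is hypothetical (`Paper`, `DataSource`, `StaticSource`, `build_feed`) and is not taken from the repo; it only shows how a common interface would let Crossref, OpenAlex or anything else plug in behind the same feed builder, including feeds keyed on keywords rather than journals.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass
class Paper:
    doi: str
    title: str
    keywords: tuple[str, ...] = ()


class DataSource(Protocol):
    """Anything that can return papers for a feed key (ISSN, keyword, ...)."""

    def fetch(self, feed_key: str) -> Iterable[Paper]: ...


class StaticSource:
    """Toy in-memory source standing in for a real metadata API client."""

    def __init__(self, papers):
        self.papers = list(papers)

    def fetch(self, feed_key):
        # Here the feed key is treated as a keyword to match against.
        return [p for p in self.papers if feed_key in p.keywords]


def build_feed(source: DataSource, feed_key: str) -> list[dict]:
    """Turn whatever the source returns into generic feed entries."""
    return [
        {"title": paper.title, "link": f"https://doi.org/{paper.doi}"}
        for paper in source.fetch(feed_key)
    ]
```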

NikaShilobod,

@jonny Also, look at how these guys manage to pull feeds. There may be some source code you can translate: https://github.com/shevabam/get-rss-feed-url-extension

jonny,
@jonny@neuromatch.social avatar

@NikaShilobod
Maybe! The problem here is sort of exactly that there arent RSS feeds sometimes, but the commercial circumstances of the publishing industry still require them to publish non-copyrightable metadata on platforms largely maintained by librarians and library-adjacent orgs, so exploiting an opening, but if there is generalized scraping code anywhere im always game to read it

irenes,

@jonny @lili @roaldarboel oh hey! we're excited to hear about this, it's important stuff!

jonny,
@jonny@neuromatch.social avatar

@irenes @lili @roaldarboel sometimes you spend 4 years working on something and nobody cares, sometimes you spend 4 hours on something and everybody cares <3

roaldarboel,

@jonny @irenes @lili How I felt with my LaTeX template 😂

jonny,
@jonny@neuromatch.social avatar

@roaldarboel
Plz all TeX work is sacred ♥

jonny,
@jonny@neuromatch.social avatar

@roaldarboel (and i love your template)

neuralreckoning,
@neuralreckoning@neuromatch.social avatar

@jonny @lili @roaldarboel I'd love to have the time and energy to help with this but I have to be realistic. 😞 I will try to comment on those issues you mentioned to me though!

jonny,
@jonny@neuromatch.social avatar

@neuralreckoning
@lili @roaldarboel
Dw we got this one. Repo will be there if you catch some inspo

citc,

Do the journals have to be open access?

jonny,
@jonny@neuromatch.social avatar

@citc nope ;)
