lambdamoses,
@lambdamoses@fosstodon.org avatar

Bioinformaticians: If you see an R package in bioinformatics that is in CRAN instead of Bioconductor, does this raise a bit of suspicion? Why or why not?

devSJR,
@devSJR@fosstodon.org avatar

@lambdamoses It does not make me suspicious. For me it is easier to deploy a package via CRAN. Just compare
install.packages("Foo")
vs.
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install(version = "3.17")
BiocManager::install(c("Foo"))

https://www.bioconductor.org/install/

But, both are great resources to be honest.

Mehrad,
@Mehrad@fosstodon.org avatar

@lambdamoses
If I see anything in Bioconductor I will be super suspicious about the performance, accuracy and maintenance of the package. I try to avoid using Bioconductor packages as much as possible due to all the flaws Bioconductor has, from the lack of rigorous review to the lack of a proper issue tracker in the majority of the packages (this is because Bioconductor is trying to encourage people to use their git server).

I also encourage others to be vigilant about using Bioconductor packages.

Mehrad,
@Mehrad@fosstodon.org avatar

@lambdamoses
Perhaps I should add a few points:

  1. By vigilant I don't mean hesitant. I mean vigilant. Do the background check and see if the package is fixing bugs and has some sort of testing to give you the desired output

  2. Bioconductor is a package deal. It comes with its own git platform, forum, ... and all this means that if you sign up for BioC you will be able to use all of that. But have you ever tried to report a bug? A large portion of the packages have no issue tracker attached.

Mehrad,
@Mehrad@fosstodon.org avatar

@lambdamoses
3. Any code on any platform can be written in a bad way, but in data analysis the most important things are accuracy and reproducibility. BioC does not have a way to track changes (NEWS and updates) in the packages one uses. In CRAN you can write a crawler in 5 minutes to work with the website and pull the news for you. The BioC website is ... well, look for yourself.

Lluis_Revilla,
@Lluis_Revilla@fosstodon.org avatar

@Mehrad @lambdamoses Bioconductor reports the updates of all packages in each release. For example: https://bioconductor.org/news/bioc_3_17_release/ And tracks changes with git (which CRAN doesn't)

You can look at all the code in https://code.bioconductor.org/

Yes, the website is bad but not worse than others. It is under transformation/update. You can provide feedback via this form: https://t.maze.co/179032315

Mehrad, (edited )
@Mehrad@fosstodon.org avatar

@Lluis_Revilla
Thanks for the comment👍🏼 but:

  1. I as a user don't care about the full release of BioC. I want to know about the packages I personally care about and I want to have them as soon as they are in release state (not BioC release)

  2. This is the exact reason I used all caps for NEWS, because I didn't mean the code changes and diffs

  3. BioC provides git without any of the QoL parts like issue tracker, CI/CD, ... What BioC does imho is of no use and partly negligence

@lambdamoses

Mehrad,
@Mehrad@fosstodon.org avatar

@Lluis_Revilla
...Things like SourceForge are doing much better in terms of connecting users to developers compared to BioC.

Regarding the website, it is what it is. When those who care about BioC don't care, why should I? If comments were welcome, there would have been a proper transparent channel to discuss this (also, your link says "This maze is no longer available").

I wrote a scraper for BioC (no API available) and that was when I realized how messy the structure is.

@lambdamoses

Mehrad,
@Mehrad@fosstodon.org avatar

@Lluis_Revilla
So to wrap up, BioC started back in the day with a solid and cool idea, but that was where it stopped being good.

Of course this is a free world and full of FLOSS projects and people should use what they feel comfortable with. I for one try my best to stick with software that is well maintained and whose existing and past issues are known and publicly documented (not in the maintainer's email!). This builds trust imho.

Hopefully BioC one day gets up to my standards🤗

@lambdamoses

Lluis_Revilla,
@Lluis_Revilla@fosstodon.org avatar

@Mehrad @lambdamoses I am not sure which are those standards. Could you provide a list or some criteria? (CRAN doesn't have an API, does not provide a way to connect users with developers or to get the NEWS of a developing package).

If you want, you can come and join the meetings about the state and future of R repositories at the R repositories working group (https://github.com/RConsortium/r-repositories-wg).

Thanks for your constructive criticism! I think the quality standards will improve if we work on it.

Mehrad, (edited )
@Mehrad@fosstodon.org avatar

@Lluis_Revilla
This is very interesting and exciting. I definitely would support the cause.
There is not enough info on that repo about the event timing and organization. Based on the timing of the MoM, the next one should be any day now. Should I just create a PR and add my name, or is there a procedure to go through in order to participate?

Lluis_Revilla,
@Lluis_Revilla@fosstodon.org avatar

@Mehrad The next one is currently ongoing. I can send you a link. But better if you publicly post it (so that the organizer is aware and adds you to the mail distribution)

Lluis_Revilla,
@Lluis_Revilla@fosstodon.org avatar

@Mehrad @lambdamoses
I agree the relationship between users and developers could be improved in Bioconductor.

Bioconductor users and developers care about the website. That's why I was providing a link to get comments also from others not so much involved with it.
Sorry, I posted the link from a channel but it seems it is now closed. Maybe @bioconductor could open it again.

What were you trying to scrape? I am doing it via available.packages and it works without problems for many months already.

Mehrad,
@Mehrad@fosstodon.org avatar

@Lluis_Revilla
I tried to find my old post but I got tired of scrolling. The reason I was scraping BioC was that they had a very bold claim that BioC packages are very well maintained and there is always a good way to report bugs and issues. I scraped all of CRAN and BioC, extracted the package info presented on the websites, and got the stats on how many packages on either platform indeed have a proper issue tracker. The result was simple: CRAN won by a large margin but there is a big BUT
...

Mehrad,
@Mehrad@fosstodon.org avatar

@Lluis_Revilla
... The issue was that BioC does not enforce the formatting, and people were putting the git repo link in various fields, and there were nonsense URLs as well. So I ended up ignoring the whole bold claim and "assumed" BioC didn't lie. At least I played it ethical in the sense that I accepted that my data collection was too noisy to get to a solid conclusion (during a weekend project).

Anyways, if I don't know the history of bugs of a package, it means I cannot later go back...

Lluis_Revilla,
@Lluis_Revilla@fosstodon.org avatar

@Mehrad you can have your own set of tests for the packages to ensure they maintain the behavior you expect. (I'm not doing that but I might do this if I'm expecting to host a shiny app or an internal repository for long)
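A minimal sketch of what such a user-side check could look like. The helper name `check_behavior` is hypothetical, and `stats::median` only stands in for whatever package function one depends on; the expected value would normally be recorded from a run that was verified by hand.

```r
# Compare a function's current output with a previously recorded value.
# `check_behavior` is a hypothetical helper, not part of any package.
check_behavior <- function(fun, input, expected, tolerance = 1e-8) {
  isTRUE(all.equal(fun(input), expected, tolerance = tolerance))
}

# stats::median stands in for a function from a package you rely on.
recorded_input <- c(1, 3, 2, 10)
recorded_output <- 2.5

if (!check_behavior(stats::median, recorded_input, recorded_output)) {
  warning("Behavior changed since the expected value was recorded")
}
```

Running a handful of such checks after every package update gives an early signal that results may have shifted, independent of what the repository reports.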

Mehrad,
@Mehrad@fosstodon.org avatar

@Lluis_Revilla
... and trust my results. In the "best" case scenario I can only know about the bugs that have been fixed, not the bugs that have been reported and are still dangling. This is a serious part imho.

Anyways. CRAN has a very minimal HTML website and everything is pretty clear, but in BioC some stuff is loaded dynamically with AJAX or something similar, which makes going through things difficult and unfeasible.

Lluis_Revilla,
@Lluis_Revilla@fosstodon.org avatar

@Mehrad I think I remember your post.

Yes, sometimes the leadership makes very bold claims that are not supported. Like saying vignettes are the best, while some are a link to a book on GitHub that no longer works... But you don't need to scrape Bioconductor to get that info! And there is nothing in Bioconductor preventing maintainers from adding the info in a field in DESCRIPTION.
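The issue-tracker statistic discussed above can be computed from package metadata instead of scraped HTML. A minimal sketch, assuming a data frame with a `BugReports` column (for CRAN, `tools::CRAN_package_db()` returns one; an equivalent source for Bioconductor is not shown here). The helper `tracker_share` and the toy packages are hypothetical.

```r
# Fraction of packages that declare a non-empty BugReports field.
# `tracker_share` is a hypothetical helper, not part of any package.
tracker_share <- function(db) {
  bug <- db[["BugReports"]]
  has_tracker <- !is.na(bug) & nzchar(trimws(bug))
  mean(has_tracker)
}

# Toy metadata (made-up packages and URLs) to show the calculation.
toy <- data.frame(
  Package = c("pkgA", "pkgB", "pkgC", "pkgD"),
  BugReports = c("https://example.org/pkgA/issues", NA, "",
                 "https://example.org/pkgD/issues"),
  stringsAsFactors = FALSE
)
tracker_share(toy)

# With network access, the same function could be applied to CRAN metadata:
# cran <- tools::CRAN_package_db()
# tracker_share(cran)
```

This only counts whether the field is filled in. As noted earlier in the thread, links placed in other fields or nonsense URLs would need extra cleaning before the numbers mean much.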

_mdoyle_,

@Lluis_Revilla @Mehrad @lambdamoses I've reopened the Bioconductor website survey link (https://t.maze.co/179032315) if you'd like to give feedback, you're very welcome

Mehrad,
@Mehrad@fosstodon.org avatar

@_mdoyle_
Thanks for putting the efforts into this, but now I have some burning questions that I would be thankful if you kindly provide answers to:

  1. does it hurt if the actual link is used instead of a shortened URL?
  2. does it hurt if a FLOSS platform is used for shortening the URL?
  3. Why does a feedback form have an expiration date? Isn't feedback always welcome in BioC?

(1/2)
@lambdamoses @Lluis_Revilla

Mehrad,
@Mehrad@fosstodon.org avatar

@_mdoyle_

  1. If this is a "post-launch" survey, why does it link to "preview"?

  2. The form says it expired two days ago

  3. I genuinely don't know if you are mocking me here, but it would have been great if the bitly link would actually work! The bitly URL expands to the following which... look for yourself!
    https://redesign-bioconductor-staging.s3-website-us-east-1.amazonaws.com/

Please for now consider these a form of feedback, as I have spent way more than 5 mins and still am not past the first page.

(2/2)
@lambdamoses @Lluis_Revilla

Lluis_Revilla,
@Lluis_Revilla@fosstodon.org avatar

@Mehrad Thanks for voicing these concerns. I have similar (and more) concerns but didn't know how to write them politely.

@_mdoyle_ @lambdamoses

_mdoyle_,

@Mehrad
Thank you for taking the time to provide feedback. We genuinely appreciate the effort you put into examining our survey closely. Let me address your questions and concerns:

  1. Using an actual link doesn't hurt, we used a shortened URL to make it easier to remember the link or copy it down.
  2. Using a FLOSS platform for shortening the URL is certainly an option and we'll consider this for future.

(1/3)
@lambdamoses @Lluis_Revilla

Lluis_Revilla,
@Lluis_Revilla@fosstodon.org avatar

@Mehrad @lambdamoses

  1. As a user you should be using the full release of BioC. That's the idea behind the scheduled release. You are of course free not to do it, but at your own risk. If you disagree with this policy, that's the beauty of repos

  2. You can get the news of any package via news() once installed. If you want to look at them before installing, you can find them at https://bioconductor.org/packages/devel/bioc/news/<package>/NEWS
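A small sketch of the second point. `bioc_news_url` is a hypothetical helper that only fills in the URL pattern quoted above; the `branch` argument is an assumption (only `devel` appears in the pattern above), and `"Foo"` is a placeholder package name.

```r
# Build the NEWS URL for a Bioconductor software package from the pattern
# quoted above. `bioc_news_url` is a hypothetical helper.
bioc_news_url <- function(pkg, branch = "devel") {
  stopifnot(is.character(pkg), length(pkg) == 1L, nzchar(pkg))
  sprintf("https://bioconductor.org/packages/%s/bioc/news/%s/NEWS", branch, pkg)
}

bioc_news_url("Foo")

# For a package that is already installed, the NEWS can be read locally:
# utils::news(package = "Foo")
```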

lambdamoses,
@lambdamoses@fosstodon.org avatar

deleted_by_author

  • Mehrad,
    @Mehrad@fosstodon.org avatar

    @lambdamoses
    Don't get me wrong, the scraping is solely a by-product of the lack of an API, and it revealed poor web design. They sacrifice performance and organization for beauty.

    Also, there are a ton of packages on BioC that do not have an issue tracker linked. So yeah, and I was told by BioC folks here on the fediverse to report the issue via email! Well, I have had unreliable experiences with BioC packages so far, and let's say that they are not up to my standards regarding reliability and transparency.

    Mehrad,
    @Mehrad@fosstodon.org avatar

    @lambdamoses
    I also asked a question on the forum and, after quite some time, the dev responded in a very egotistical way. After I proved that this was a bug and he fixed it, I had to wait until the next release 🙄 so I manually patched it locally. We have version numbers for a reason. We don't need an extra layer of redundancy. If someone wants generations, they should use Guix time-machine to keep everything accurately in check, and not match the rolling release of CRAN with the point release of BioC.
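A rough sketch of the version-number point: whether a local patch is still needed can be decided by comparing the installed version with the version that carries the fix. `needs_local_patch` and the version strings are hypothetical.

```r
# TRUE when the installed version predates the release containing the fix.
# `needs_local_patch` is a hypothetical helper.
needs_local_patch <- function(installed, fixed_in) {
  package_version(installed) < package_version(fixed_in)
}

needs_local_patch("1.4.2", "1.4.3")   # fix not yet installed
needs_local_patch("1.10.0", "1.9.0")  # compared numerically, not as text

# For an installed package, the first argument could come from:
# utils::packageVersion("Foo")
```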

    lambdamoses,
    @lambdamoses@fosstodon.org avatar

    Well, at least that's better than the wild west of GitHub, right? I still wonder why most R package developers, at least in spatial transcriptomics, don't get their packages to CRAN or Bioconductor.

    brodriguesco,
    @brodriguesco@fosstodon.org avatar

    @lambdamoses I’m not in bioinformatics, but I’m curious: why would it seem suspicious if a package is on CRAN instead of bioconductor?

    lambdamoses,
    @lambdamoses@fosstodon.org avatar

    deleted_by_author

  • brodriguesco,
    @brodriguesco@fosstodon.org avatar

    @lambdamoses Interesting, I wasn’t aware that the bar was even higher than CRAN! I understand your initial statement better now.

    Mehrad,
    @Mehrad@fosstodon.org avatar

    @lambdamoses
    Interesting. My personal experience as a bioinformatician with about a decade of dealing with CRAN and Bioconductor package submission is exactly the opposite of what you explained.

    Also, one major issue with BioC is that they force packages to use certain data structures that themselves published, which is putting unnecessary burden on developers and gaining the BioC more citation! At least they did that to the packages we were submitting.

    @brodriguesco

    brodriguesco,
    @brodriguesco@fosstodon.org avatar

    @Mehrad @lambdamoses ok so now I’m confused 😂 Interesting discussion though, looking forward to other people weighing in

    Lluis_Revilla,
    @Lluis_Revilla@fosstodon.org avatar

    @Mehrad @lambdamoses @brodriguesco The idea behind the repository is to have tighter integration between packages so that the tools are easier to use. This is easier if packages use the same classes or classes derived from the core ones.
    I am part of a working group trying to address the problems with classes in Bioconductor (sometimes people also tend to create classes when there is no need). I don't think one should cite all classes they use if it is not new/relevant when publishing.

    devSJR,
    @devSJR@fosstodon.org avatar

    @lambdamoses @brodriguesco @Mehrad I can agree with several parts of that. Most of my packages are S3, one is S4. I never really saw a benefit for me in S4. On CRAN one can even choose R6, which is a profoundly different OO system from S3 and S4. On BioC S4 is the rule as far as I know.
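For readers who have not used both systems, a minimal side-by-side sketch of the same toy class in S3 and S4. All class, generic, and field names here are hypothetical and not taken from any CRAN or Bioconductor package.

```r
# S3: the class is just an attribute; methods are found by naming convention.
new_qpcr_s3 <- function(ct) structure(list(ct = ct), class = "qpcr_s3")
mean_ct <- function(x, ...) UseMethod("mean_ct")
mean_ct.qpcr_s3 <- function(x, ...) mean(x$ct)

# S4: the class has a formal definition with typed slots,
# and generics and methods are registered explicitly.
methods::setClass("QpcrS4", representation(ct = "numeric"))
invisible(methods::setGeneric("meanCt", function(x) standardGeneric("meanCt")))
methods::setMethod("meanCt", "QpcrS4", function(x) mean(x@ct))

s3_obj <- new_qpcr_s3(c(20, 22))
s4_obj <- methods::new("QpcrS4", ct = c(20, 22))

mean_ct(s3_obj)
meanCt(s4_obj)

# S4 checks slot types when the object is created; S3 does not.
bad <- try(methods::new("QpcrS4", ct = "not a number"), silent = TRUE)
inherits(bad, "try-error")
```

The extra ceremony in S4 buys slot type checking and formal inheritance, which is part of why a repository that wants packages to share core classes might prefer it. Whether that is worth it for a small package is exactly the disagreement in this thread.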

    lambdamoses,
    @lambdamoses@fosstodon.org avatar

    deleted_by_author

  • Mehrad,
    @Mehrad@fosstodon.org avatar

    @lambdamoses

    You should not force devs to comply with your ecosystem while you are the one who profits from it. If an object format is good, people will gradually converge on using it. I myself don't like tibble (lack of row.names doesn't sit right with me), but in spite of that it is a good data structure (comparatively), and hence many devs are using it. Freedom is the key imho. Let the ecosystem grow organically.

    lambdamoses,
    @lambdamoses@fosstodon.org avatar

    deleted_by_author

  • Mehrad,
    @Mehrad@fosstodon.org avatar

    @lambdamoses To re-iterate, my issue is with forcing people to do what you want and closing their options. Also it does not help when I see certain people directly get academic virtual points through enforcing such restrictions.

    If my package is hard to work with, people will not use it. Really that simple! It should be my business and my business only how I formulate my package regarding data types and algorithm (as long as it is not malicious). People will vote by their usage.

    Do you agree?

    lambdamoses,
    @lambdamoses@fosstodon.org avatar

    deleted_by_author

  • lambdamoses,
    @lambdamoses@fosstodon.org avatar

    @Mehrad @Lluis_Revilla I think it would be nice to have some regular meetings with users and developers from all walks of life to discuss and build such a standard, if you don't like SCE or AnnData. It's just like standards in hardware and how much trouble the lack of a standard causes. For example, in cycling, you can't really mix Shimano and SRAM, and newer components are incompatible with older components, so swapping one part causes you to swap many others, even when the others still otherwise work.

    lambdamoses,
    @lambdamoses@fosstodon.org avatar

    @Mehrad @Lluis_Revilla Imagine on a bike ride, you buy a new inner tube to change a flat. Instead of just Presta and Schrader valves which are the standard, you have 10 different standards and need adaptors all the time to inflate the tube and sometimes the adaptors don't work. Many of those valves don't fit in your rim. Isn't that annoying? Rims and tubes conform to Presta or Schrader standards not to give creators of Presta and Schrader valves extra credit. The patents have long expired.

    Lluis_Revilla,
    @Lluis_Revilla@fosstodon.org avatar

    Bioconductor core members decided that the way to go is SummarizedExperiment and S4Vectors; now all the users and potential users have to either push for a change or accept it and use them. Part of the task of the working group I mentioned previously, in my opinion, is to help establish methods to select classes for Bioconductor and to decide when they are compulsory or not. Let's see how it goes...

    lambdamoses,
    @lambdamoses@fosstodon.org avatar

    @Lluis_Revilla But I think @Mehrad has a good point. While I personally don't complain about SummarizedExperiment and S4Vectors, how democratic is that decision? And who's in that working group?

    Mehrad,
    @Mehrad@fosstodon.org avatar

    @lambdamoses
    Just to provide extra supplementary material for what I said before, and in the light of a working link:

    Check the following link, which has a "ranking" of the BioC packages. Pay attention to the top 15 packages and think for yourself what the criteria of such a ranking are and, most importantly, "why":

    http://redesign-bioconductor-staging.s3-website-us-east-1.amazonaws.com/packages/release/BiocViews.html#___Software

    So, yeah, that is that. Now let's agree (or agree to disagree) and move on to whichever repo we found more ethical and robust (at least for now).

    @Lluis_Revilla

    lambdamoses,
    @lambdamoses@fosstodon.org avatar

    deleted_by_author

  • Mehrad,
    @Mehrad@fosstodon.org avatar

    @lambdamoses
    Really? Are you serious? For the past two days you were pushing me to prove they have forced people to include their packages. This is the direct by-product. When you force your format into every package under the sun, obviously, you will become a dependency of all those packages, and ultimately, your packages will be the most downloaded ones as well. Then you formulate the ranking based on what gives you an edge, and voilà, your packages are the "top most ranked packages".

    @Lluis_Revilla

    lambdamoses,
    @lambdamoses@fosstodon.org avatar

    @Lluis_Revilla @Mehrad Actually, I think the ranking is problematic, because older packages tend to rank higher simply because there has been more time for people to download them. New packages usually rank lower. That's my understanding, correct me if I'm wrong.

    Mehrad,
    @Mehrad@fosstodon.org avatar

    @lambdamoses
    I don't know about BioC, but proper repos have two types of download count: absolute total, and moving window (weekly or monthly). For instance my oldest package has around 9.3k downloads per month and 405k in total, and my newest has 225 Dpm and 11k in total. The moving window kinda corrects for the bias you mentioned.
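A minimal sketch of the two counts, using made-up daily download numbers rather than data from any repository. `download_summary` is a hypothetical helper and the 30-day window is an assumption standing in for "monthly".

```r
# Total downloads versus downloads in the most recent `window` days.
# `download_summary` is a hypothetical helper; `daily` is one count per day.
download_summary <- function(daily, window = 30L) {
  c(total = sum(daily), recent = sum(utils::tail(daily, window)))
}

# Made-up data: an old package with a year of history and a new one
# that only appeared 30 days ago, both at 10 downloads per day.
old_pkg <- rep(10, 365)
new_pkg <- c(rep(0, 335), rep(10, 30))

download_summary(old_pkg)
download_summary(new_pkg)
```

Here the totals differ by more than a factor of ten while the recent-window counts are identical, which is the age bias the moving window removes.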

    BioC is BioC, I don't care about them, and it seems neither do they themselves.

    I again suggest you to be vigilant about the repo.
    Good night

    @Lluis_Revilla

    lambdamoses,
    @lambdamoses@fosstodon.org avatar

    deleted_by_author

  • Mehrad,
    @Mehrad@fosstodon.org avatar

    @lambdamoses
    > Where did you get that intention?

    From the package submission process and when they forced us to do so (I mentioned this before)

    > I have always understood their wanting us to use their classes as trying to make a standard when the R developer community

    And you think forcing people is the solution?

    > at least in single cell, is unable to converge on a standard for whatever reason.

    Can it be for the reason that those "standard" formats are not good and people don't like them?

    devSJR,
    @devSJR@fosstodon.org avatar

    @lambdamoses
    I have some packages on CRAN myself. I guess it is just because it is easier to have them published on GitHub. The quality control at CRAN is hard but worth it.
