henrikbengtsson

@henrikbengtsson@mastodon.social

CS/Math Stat/UCSF Assoc Prof
R Foundation/R Consortium
science/reproducibility/devel
http://jottr.org, http://futureverse.org
hang gliding/paragliding
born at 330ppm
he/him
#rstats


hrbrmstr, to random

🚨Looks like #RStats was not immune to deserialization bugs after all https://hiddenlayer.com/research/r-bitrary-code-execution/

Watch those R Data files (and we should now come up with better ways to ensure local R library integrity)!!

CVE-2024-27322

henrikbengtsson,

@rmflight @hrbrmstr Yeah, that looks to be what protects against this in R (>= 4.4.0), i.e. preventing unserialize()/readRDS() from returning a promise.

One could argue that an R 4.3.4 release patching this for the now-old R 4.3 series should be made.

To nitpick on the report, "Once the malicious file has been created and loaded by R, the exploit will run no matter how the variable is referenced" is not 100% correct. You can always remove promises, e.g.

> delayedAssign("a", stop("boom"))
> rm(a)
>
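
A minimal sketch of the same point in script form, assuming a throwaway environment `env` and the variable names `a` and `b` (all made up for illustration): a promise only runs its code when it is forced, so removing it first means the code never runs.

```r
# Work in a scratch environment so nothing leaks into the workspace
env <- new.env()

# A promise that would fail if it were ever evaluated
delayedAssign("a", stop("boom"), assign.env = env)

# Removing the binding does not force the promise, so no error is raised
rm("a", envir = env)
removed <- !exists("a", envir = env, inherits = FALSE)

# For contrast: actually referencing a promise does trigger its code
delayedAssign("b", stop("boom"), assign.env = env)
msg <- tryCatch(get("b", envir = env), error = function(e) conditionMessage(e))
```

Here `removed` is TRUE and `msg` holds the error message from the forced promise.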

henrikbengtsson, to random

Wow! A giant cleanup day at CRAN today. 123 packages were archived on a single day, i.e. 0.6% of all CRAN packages were dropped today. We normally see 2-4 per day.

https://www.cranhaven.org/dashboard-live.html

henrikbengtsson, to random

One way to help the community & its package maintainers:

Packages get archived on CRAN all the time for different reasons. For 10-20% of them, the maintainer might not even be aware, because their email address no longer works. For recently archived packages and reasons, see https://cranhaven.org/.

You can help by finding alternative ways to notify the maintainer, e.g. post to their GitHub, GitLab, … issue tracker, notify them on social media, or email them at an alternative address.

gvwilson, to random

Should I teach bash, fish, or Nushell to data scientists who want to go beyond the basics of shell scripting? There seems to be a clear spectrum from "ubiquitous but m'gawd" to "this is the future but m'gawd in a different way".

henrikbengtsson,

@dpprdan @gvwilson I 2nd Bash; it's widely used and supported basically everywhere.

Anyone writing a Bash script should know about and use https://github.com/koalaman/shellcheck - it's an incredible tool and you learn lots of Bash from just using it - removes endless trial'n'error guessing that otherwise sticks around for years

henrikbengtsson, to random

Attention package developers:

.onAttach <- function(...) {
  if (stats::runif(1) > 0.1) return()

}

is "not-so-good" code. Anything that changes the state of the random number generator (RNG) on package load prevents reproducible results. It's impossible to protect against this in some situations, e.g. when running things in parallel, where the result depends on whether the package is already loaded on the parallel workers.

See https://mastodon.social/@maelle/112077634681229201 by @maelle
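
One possible workaround, sketched in base R: draw the random number but restore the global RNG state afterwards. The helper name `with_preserved_rng()` and the startup message are hypothetical, not from any package, and this is only one of several ways to approach it.

```r
# Hypothetical helper: evaluate 'expr', then put the global RNG state
# back to whatever it was before the call.
with_preserved_rng <- function(expr) {
  genv <- globalenv()
  old_seed <- get0(".Random.seed", envir = genv, inherits = FALSE)
  on.exit({
    if (is.null(old_seed)) {
      # The RNG was never initialized: drop any state created by 'expr'
      if (exists(".Random.seed", envir = genv, inherits = FALSE)) {
        rm(".Random.seed", envir = genv)
      }
    } else {
      assign(".Random.seed", old_seed, envir = genv)
    }
  }, add = TRUE)
  expr
}

.onAttach <- function(...) {
  # The draw happens, but the user's RNG stream is left untouched
  show <- with_preserved_rng(stats::runif(1) < 0.1)
  if (show) packageStartupMessage("Occasional startup tip goes here")
}
```

With this, `set.seed(42); runif(1)` gives the same value whether or not `.onAttach()` ran in between.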

hrbrmstr, to typst

I've been fortunate enough, today, to use:

#RStats
#Quarto
#Typst
#D2
— Observable Framework

all to get super real stuff done.

Today is a good day.

henrikbengtsson,

@hrbrmstr I had a weeks long sprint with

recently and it was a joy. It reminded me what a pleasure and gem LaTeX is

henrikbengtsson, to random

#RStats package question:

Is there ever a valid reason for a package to set/modify an R option permanently, e.g. when the package is loaded? Say, options(digits = 5), options(na.action = 'na.pass'), ...

Except for options defined by the package itself, and temporary tweaks (e.g. withr::with_options()), I cannot come up with a single case where this should be allowed. On the contrary, I'm leaning toward it should be banned. But, I'm reaching out to the community to see if I missed something
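
For reference, a minimal base-R sketch of the "temporary tweak" pattern, without withr. The function name `format_pi()` is made up for illustration: the option is changed only for the duration of the call and restored on exit, even if an error occurs.

```r
format_pi <- function(digits = 3) {
  # options() returns the previous values, which on.exit() restores
  old <- options(digits = digits)
  on.exit(options(old), add = TRUE)
  format(pi)
}

before <- getOption("digits")
out <- format_pi(3)   # "3.14"
after <- getOption("digits")
```

After the call, `after` is identical to `before`, i.e. the caller's session is left as it was found.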

henrikbengtsson, to random

FYI, CRAN requires package titles to be of length < 65 characters. Just received the following feedback for a new package:

> Please reduce the length of the title to less than 65 characters.

Do we have a community-driven place where we track these "hidden" CRAN requirements? I think there was a discussion about it, but I cannot remember if it happened.
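
Until such a list exists, a rule like this is easy to check locally before submitting. A tiny sketch, assuming the limit is exactly as quoted above (fewer than 65 characters); the helper name `title_fits_cran()` is hypothetical:

```r
# Hypothetical helper: TRUE if the title is shorter than 'limit' characters
title_fits_cran <- function(title, limit = 65L) {
  nchar(title, type = "chars") < limit
}

ok  <- title_fits_cran("A Short Package Title")
bad <- title_fits_cran(strrep("x", 65))
```

Here `ok` is TRUE and `bad` is FALSE.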

stevensanderson, to random

Extract week numbers from dates in R using strftime() & %V format code or lubridate's week() function. Just pass the date object to easily get the week number. Great for wrangling temporal data! Try it out on sample dates in R to get comfortable with date manipulation for time series analysis. Check my blog for details & examples of using these functions. Handling dates is an important R skill - give it a go!

#R

Post: https://www.spsanderson.com/steveondata/posts/2024-02-06/

henrikbengtsson,

@stevensanderson "You will also notice that strftime() returns “52” for the last date of the year, while week() returns “53”. This is because week() follows the ISO 8601 standard for week numbers."

Did you swap the two here? %V gives the ISO week, but week() doesn't. There's an isoweek() for that.
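
A quick base-R illustration of the difference, using 2023-12-31 as a sample date. The second calculation mimics what I understand lubridate's week() to do (complete seven-day periods since January 1st, plus one); it is a stand-in for illustration, not lubridate's actual code.

```r
d <- as.Date("2023-12-31")

# ISO 8601 week number via the %V format code
iso_week <- format(d, "%V")

# Stand-in for a "week of year" count: complete 7-day periods since Jan 1, plus one
# (as.POSIXlt()$yday is zero-based, so Jan 1 has yday 0)
simple_week <- as.POSIXlt(d)$yday %/% 7 + 1
```

This gives `iso_week` as "52" and `simple_week` as 53, which matches the numbers quoted from the blog post, with the ISO label on the %V side.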

henrikbengtsson, to random

A Friday afternoon thought:

No more manually tracking down what papers and sources to reference when running an analysis. Proposal:

refs <- gather_citations({
  result <- some_pipeline_call()
})

This will:

  1. set an option telling packages and their functions to feel free to signal citation_condition:s
  2. have a calling handler gather all citation_condition:s and return them in 'refs'

Various utility functions can then operate on this, e.g.

unique(refs)
summary(refs)
toBibtex(refs)
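
A rough proof-of-concept of how this could work with base R conditions. Everything here is hypothetical: the option name `citations.collect`, the helper `cite()`, and the use of plain character keys instead of real citation objects are placeholders to show the mechanics, not a proposed API.

```r
# Called by package code; signals a citation only when someone is listening
cite <- function(key) {
  if (isTRUE(getOption("citations.collect", FALSE))) {
    cond <- structure(
      class = c("citation_condition", "condition"),
      list(message = key, call = NULL)
    )
    signalCondition(cond)
  }
  invisible(NULL)
}

gather_citations <- function(expr) {
  refs <- character(0)
  # Step 1: tell packages that citation conditions are welcome
  old <- options(citations.collect = TRUE)
  on.exit(options(old), add = TRUE)
  # Step 2: a calling handler collects them while 'expr' keeps running
  withCallingHandlers(expr, citation_condition = function(cond) {
    refs <<- c(refs, conditionMessage(cond))
  })
  refs
}

# Toy pipeline standing in for real package functions
some_pipeline_call <- function() {
  cite("Smith2020"); cite("Doe2021"); cite("Smith2020")
  42
}

refs <- gather_citations({
  result <- some_pipeline_call()
})
```

Because a calling handler is used, the pipeline is not interrupted: `result` is still assigned, and `unique(refs)` gives the two distinct keys.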

henrikbengtsson, to random

First one for me: someone had problems with parLapply() and asked about it on SO. One person suggested trying lapply() first. I added a follow-up to clarify that's a good approach, and that after getting it to work with lapply(), it's likely parLapply() will work too. Pretty sure both of us wanted to help OP by giving them a tool for approaching these types of issues.

Then someone, not necessarily OP, made this comment. Interesting take on our attempt to help. Bad day? The comment and the whole question are now gone.

henrikbengtsson, to macos

TIL (relearned?) that a non-privileged user can bind to "privileged" ports 1-1023 on both macOS [A] and MS Windows [B].

Here's an example on MS Windows 10:

> options(help.ports=80)
> port <- tools::startDynamicHelp(TRUE)
starting httpd help server ... done
> port
[1] 80
> help.start(browser = print)
[1] "http://127.0.0.1:80/doc/html/index.html"

On Linux, ports 1-1023 are out of reach to non-privileged users.

[A] https://developer.apple.com/forums/thread/674179
[B] Need help to find official reference 🙏

henrikbengtsson, to random

Conda is neat and all, but oh my, it wreaks havoc with users' Linux accounts, changing include and library paths, causing all types of conflicts (e.g. ld: cannot find -lxml2: No such file or directory)

Many users just find a 'conda install' command online and cut'n'paste it. Unless the user understands the error messages, they end up with lots of trial and error, sometimes making things worse.

Conda is by far++ the most common cause of compilation and run-time errors I see on HPC environments

henrikbengtsson,

@rstub

At least, as a start, it should be much harder for Conda to inject itself into the user's ~/.bashrc file. It should come with lots of warnings and "are you really sure?" prompts. I'd probably advocate for an env var that sysadmins can set to tell Conda to never ever do that. In Bash, the env var can be made read-only.

henrikbengtsson, to random

@Posit , please consider adding:

for Shiny R, just like:

takes you to Shiny Python

henrikbengtsson, to random

Can't find that useful toot from a few weeks ago?

Mastodon community, please consider enabling the following opt-in setting:

  1. Go to Preferences
  2. Go to Public profile
  3. Go to Privacy and reach
  4. Check [x] 'Include public posts in search results'.

This will make your posts searchable, not only by hashtag but also by free-text search. I enabled mine a few weeks ago.

kirill, to random

Single ' and double " quotes are similar in R but very different in the (bash/zsh) shell. More often than not, I want the semantics of single quotes when I type a command. (Example: backticks are interpreted differently between double quotes.)

But the double quote is so deeply ingrained in my muscle memory 💪🏼 that I automatically type that instead. I want to fix that habit, without success so far.

Does this sound familiar? What has helped you in the past?

henrikbengtsson,

@kirill

ShellCheck (https://github.com/koalaman/shellcheck) is a fantastic and amazing tool that helps with this. It validates scripts, and it catches all the common mistakes and home-made misunderstandings we have around using Bash. This alone has improved my understanding of Bash and I now feel like I'm no longer hacking together Bash code.

1/

henrikbengtsson,

@kirill

To get this instant feedback at the Bash prompt, we created ShellCheck-REPL (https://github.com/HenrikBengtsson/shellcheck-repl). It nudges you to write proper, safe Bash syntax at the CLI prompt too.

Say, if I enter:

$ echo 'Hello "${USER}"'
^---------------^ SC2016 (info): Expressions don't expand in single quotes, use double quotes for that.

it won't evaluate that, but instead have me fix issue SC2016 (https://www.shellcheck.net/wiki/SC2016). I can always override it by adding two spaces at the end.

2/

mmaechler, to random

r-project.org TLD needed registration renewal... and we found out considerably later than we should have, i.e. several hours after it stopped working ;-(( Finally solved by paying online w/ my own CC... -- hooray!

henrikbengtsson,

@hrbrmstr @mmaechler

Bob, are you referring to the risk that someone else could have taken over the domain? And if that happened, they could have started serving malicious versions of our CRAN packages via cran.r-project.org, which then would have been mirrored all over the world?

If so, then some important questions are: Were we ever open to this problem and for how long? Or was there a domain-change grace period that saved us? What can be done to avoid this risk in the future?

henrikbengtsson, to random

: Today Sept 19 @ 3pm:

We're meeting at a round table in Riverside (lunchroom downstairs; right & right after the escalators; you'll see us) to talk about parallel processing in R.

One topic is marshalling - figuring out how to send special, non-exportable objects to other R processes.

Looking forward to seeing you there
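
To give a flavor of what marshalling means here, a toy sketch using a read-only file connection, a classic non-exportable object. The helpers `marshal_file_con()` and `unmarshal_file_con()` are made up for illustration, and the sketch assumes the receiving process can see the same file path; it also does not preserve the read position.

```r
# Replace the live connection with a plain description that can be serialized
marshal_file_con <- function(con) {
  info <- summary(con)
  list(path = info$description, mode = info$mode)
}

# Rebuild an equivalent connection from that description
unmarshal_file_con <- function(m) {
  file(m$path, open = m$mode)
}

path <- tempfile(fileext = ".txt")
writeLines(c("alpha", "beta"), path)

con <- file(path, open = "r")
marshalled <- marshal_file_con(con)
close(con)

# Simulate shipping the description to another R process
payload <- unserialize(serialize(marshalled, connection = NULL))

con2 <- unmarshal_file_con(payload)
lines <- readLines(con2)
close(con2)
```

The connection itself never crosses the process boundary; only the recipe for recreating it does.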

henrikbengtsson,

@Drmowinckels @jeremy_data oh no. Double bummer. Let's try to connect later. There's a non-zero chance we reconvene tomorrow too

henrikbengtsson, to random

If you're interested in discussing parallel processing in R at , please let me know or reply here. I'm hoping we can have an informal hangout on Monday, Tuesday, or Wednesday to discuss what's missing, what's new, and what's on the roadmap for parallelization in R. It doesn't have to be on just

(I'll arrive late Sunday Sept 17 and leave early Thursday Sept 21)

smach, to rstats

The {progressr} 📦 “provides a minimal API for reporting progress updates in R. . . .What type of progress to signal is controlled by the developer. How these progress updates are rendered is controlled by the end user. For instance, some users may prefer visual feedback such as a horizontal progress bar in the terminal, whereas others may prefer auditory feedback.”
By @henrikbengtsson
https://progressr.futureverse.org/
@rstats

henrikbengtsson,

@smach @rstats
Thanks for sharing. There's a typo: should be

ijlyttle, to random

I am really excited to see all the talks accepted for 🎉

Sadly for me, my proposal was not accepted - but I wanted to share the video with you all: https://www.youtube.com/watch?v=9FWjeomYphg

It's a testimonial to a project by @henrikbengtsson, "Wishlist for R": https://github.com/HenrikBengtsson/Wishlist-for-R

Henrik performs a big service for the R Community, and I'd like to say thanks!

henrikbengtsson,

@ijlyttle Wow, thanks for the kind words 🥰 I'm really glad to hear you find it this useful. It started out as a single README.md that several had commit rights to, but then ideas grew and we decided to simply use issues (kinda obvious in hindsight). I find it useful to scribble down detailed ideas there for others to find and to contribute to. Hopefully it matures over time and eventually can be proposed to R Core on BugZilla or R devel.
