The latest Excel blunder from Austria is a lesson in why we need professional data people.
There's a widespread expectation that anyone can take Excel and use it to do critical things with data.
But data people know that we need the right tools to make data processing verifiable (usually with code) and enable us to check that everything is as it should be (with unit tests or assertions).
And most of all, we need more #dataliteracy at every level.
Having learned programming mostly in #rstats, I realize that I have a very fuzzy mental model of what "compiling" code even means. Can someone point me to an explanation of what it means to "compile" or "build from source" for semi- or non-experts?
Is it weird that one of the main sources of minor friction I experience when experimenting in #python, coming from #rstats, is the way it prints/formats objects at the command line? I'm sure some of it is familiarity, but I find things slower to parse, visually.
New releases of vetiver 🏺 are out, for both #rstats and #python, and I'm excited to outline some of the new features, including support for deploying a vetiver model to AWS SageMaker:
So, a really cool new tool I've been learning as I work with Lisp for my startup: symex.el for structural navigation and editing is VERY efficient, somewhat vim-based, and I really like it. As an added bonus, some of its dependencies (e.g. lispy) are very useful for multi-lining s-exps (aka symexes) and formatting them, and it plays nicely with sly, which is even better!
Additionally, for all you #rstats people out there (wondering why I included the tag on a post about Lisp), maybe take a quick look at this. I'm currently using it myself, and I've found it's pretty good for most basic things; you may like it if you give it a try (or maybe not: it's not yet as full-featured as R and its various packages, but it does benefit from some things I don't think you can get easily from R). Also, here's a super quick demo.
I know other tools leveraging tree-sitter try to achieve similar functionality, but when the code is already in an AST format, it really eliminates the guesswork and makes the experience seamless!
I still haven’t seen anything to disprove the best description of R vs Python for data/stats that I’ve seen:
Python is an elegant, well-designed language with a confusing, oddly designed data DSL bolted onto it, & R is an elegant, well-designed data DSL with a confusing, oddly designed programming language built around it. #rstats
Is there a web API that you'd love to use with R, if only it weren't so painful? Or perhaps you're using one, but aren't sure if it would make sense as a package. Please let me know here! https://forms.gle/CJz12TzzHkGsnQma9
I'm spoilt by #rstats #shiny making simple things simple. Adding a dropdown to filter a table in Excel requires a hidden tab and some VBA code. Why, oh why, can't I just use UNIQUE() in the dropdown???
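For contrast, here's the kind of Shiny pattern I mean: the dropdown is populated straight from the unique values in the data, no hidden tab required. This is a minimal sketch with a made-up `sales` data frame, not code from any particular app.

```r
library(shiny)

# Hypothetical example data; in a real app this would be your own table.
sales <- data.frame(
  region = c("North", "South", "North", "East"),
  amount = c(10, 20, 30, 40)
)

ui <- fluidPage(
  # The dropdown's choices come directly from the data's unique values
  selectInput("region", "Region", choices = unique(sales$region)),
  tableOutput("filtered")
)

server <- function(input, output, session) {
  # Show only the rows matching the selected region
  output$filtered <- renderTable(
    sales[sales$region == input$region, ]
  )
}

# Run with: shinyApp(ui, server)
```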
I update some data repos when I run a daily #rstats analysis, so I thought I'd share my setup: I have a GitHub credential manager installed on my laptop so I don't need to enter passwords, and once the script works out that the data needs updating, it runs system("cd '/pathTo/data/'; git add *; git commit -m 'data update'; git push")
Glossary is a lightweight solution for making glossaries in educational materials written in Quarto or R Markdown. The package provides functions to link terms in text to their definitions in an external glossary file, as well as to create a glossary table of all linked terms at the end of a section.
The {paws} #rstats 📦 helps you access more than 150 AWS services in R, including
Machine Learning
Translation
Natural Language Processing
Databases
File Storage
By Dyfan Jones & others https://paws-r.github.io/ #AWS #rstats
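A quick sketch of what using {paws} looks like; it mirrors the AWS API, so each service is a function that returns a client object. This assumes your AWS credentials are already configured (e.g. via environment variables or ~/.aws/credentials).

```r
library(paws)

# S3: create a client and list the buckets in your account
s3 <- paws::s3()
buckets <- s3$list_buckets()

# Amazon Translate: translate a sentence from English to French
translate <- paws::translate()
result <- translate$translate_text(
  Text = "Hello, world",
  SourceLanguageCode = "en",
  TargetLanguageCode = "fr"
)
```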
I am increasingly being asked to build structural equation models (in #Rstats) and am looking for recs about a textbook for someone who is familiar with linear regression but does not have background in IRT. My director says things like, "constrain the variances to 1" and I'm like, sure...but why?
Many R packages implement performance-critical code in C++. But memory bugs in that C++ code crash R, as in this screenshot, even though R itself is designed to be memory-safe.
fixest is an excellent library but a semi-frequent offender. And I've had this happen with other libraries too.
As someone who is using R because I am not prepared to debug C++, this can pretty much break a library for me. I hope that eventually Rust can take over C++'s role in #rstats
Hey #rstats people! I have a dataset where I’ve asked people to rank a set of 10 items from 1 to 10, and I want to compare those rankings between two groups. Sounds super simple, but I can’t work out what #statistic to use! I’ve found the nParLD package, but that’s not quite right as it’s not longitudinal data. Help?