The paper with that figure got accepted by a journal requiring a CC BY 4.0 license.
What's the license of figures that we produce using that R package? The package's code is released under GPL-3, but it's not clear to me whether that also applies to the figures I produce with that package.
or how to run an #RStats {targets} pipeline on GitHub Actions using #Nix
Thanks to @determinatesystems for developing the nix-installer-action and the magic-nix-cache-action, which make setting up #Nix super easy on GitHub Actions, and thanks to @landau for {targets}!
I somehow missed that Red Hat abruptly ended Operate First - the #datascience stuff in OpenShift. I had been evaluating it earlier this year in my day job, and it looks like I dodged a bullet by sticking with upstream Jupyter.
This tutorial from @gvwilson is great. "SQL for data scientists in 100 queries" builds up complexity, adding something new in every query. From "SELECT * FROM table" to partitioned window functions and much more.
(1/3) I am excited to share that my course - Data Pipeline Automation with GitHub Actions Using R and Python 🚀, is now available on LinkedIn Learning!
The course provides an introduction to setting up automation with GitHub Actions with both R and Python. Throughout the course, we will use a real-life example by working with the U.S. Energy Information Administration (EIA) API for data automation. 🧵👇🏼
(1/6) This time of the year ☃️...Statistical Rethinking 2024 ❤️❤️❤️
This has become a tradition. Like previous Decembers, this week, the 2024 edition of the Statistical Rethinking course was announced. If you are looking to learn Bayesian statistics, I highly recommend checking it out.
Re-generating input datasets for our network merging methods + reproducible #DataScience + geocomputation + visualisation for sustainable transport planning paper. #workinprogress 🏗️
If anyone knows of any good examples of 'braiding' like this let us know!
We just released data on crown maps for 100 million trees in the National Ecological Observatory Network (NEON) with information on location, species identity, size, and alive/dead status.
We created this dataset by combining deep learning, remote sensing, and extensive field data to build models that can detect and classify individual trees. Models are almost 80% accurate and can be improved with additional field data collection.
The Machine Learning with Graphs course by Prof. Jure Leskovec from Stanford University (CS224W) focuses on different methods for analyzing massive graphs and complex networks and extracting insights using machine learning models and data mining techniques. 🧵🧶👇🏼
I love education, and have worked in #EdTech for longer than I've been involved in #DataScience (and long before I heard of #RStats), but I don't have enough experience running workshops. If you're running an online #RStats workshop and could use a TA, I'm available! I read all the things (books/blogs/social media posts/learner questions/github repos/code) so I'm ready to answer questions, but I need more experience with the format!
Please pass this along to your networks!
(1/3) Last Friday, I was planning to watch Masters of the Air ✈️, but my ADHD had different plans 🙃, and I ended up running a short POC and creating a tutorial for getting started with Ollama Python 🚀. The settings are available for both Docker 🐳 and local setups.
TLDR: It is straightforward to run LLMs locally with the Ollama Python library. Models with up to ~7B parameters run smoothly with low compute resources.
If you are using Polars, I recommend you check out the Awesome Polars repo, which provides a curated list of Polars talks, tools, examples & articles.