@smach@masto.machlis.com

smach

Director, editorial data & analytics at Foundry (an IDG company). Author of Practical R for Mass Communication & Journalism. A bit obsessed with both #RStats and #genAI. Learning #Python.

Write about R & #GenAI for #InfoWorld. She.

First joined Mastodon (at Fosstodon) on Oct 27, 2022.

Website: https://www.machlis.com
Apps: https://apps.machlis.com

Other interests: #Photography #DigitialDarkroom #Running #Bicycling #Crochet
Learning #ASL

Was @sharon000 on Twitter (no longer active)

smach, to rstats

Not traveling to Austria for next month's useR! conference? There's also a free virtual pre-conference on July 2! Sessions range from 4:30 to 21:30 CEST, so there's something for people in every time zone.
Topics include Redefining Interactive Data with Quarto and WebR, Stop Making Spaghetti (Code), Rix: Reproducible Environments with Nix, and more. Plus tutorials like Deploy and Monitor ML Pipelines with Open Source and Free Applications

https://events.linuxfoundation.org/user/program/virtual-schedule/

@rstats

smach, to USpolitics

Attention US voters concerned about the climate crisis:
https://www.threads.net/@aaron.rupar/post/C7tx6nIgtjB

smach, to climate

Climate threat not often discussed:
“Pound for pound, gallon for gallon, hour-for-hour, the two-stroke gas powered engines in leaf blowers and similar equipment are vastly the dirtiest and most polluting kind of machinery still in legal use.

“According to the California Air Resources Board (CARB), the two-stroke leaf blowers and similar equipment in the state produce more ozone pollution than all of California’s tens of millions of cars, combined.”

- James Fallows

https://fallows.substack.com/p/gas-powered-leaf-blowers-the-end

smach, to random

I cannot overstress how envious I am of democracies that have election campaigns of 6 weeks instead of 2 years

smach, to random

What happens when you mandate masks at a conference now that most people no longer wear them but medically vulnerable people are still at risk? In the case of PyCon US, the conference sells out.

Why a masking policy? “Many of us and our fellow community members can’t attend without health and safety guidelines in place. We want PyCon US to be an event that everyone feels safe attending,” organizers explained.

Well done @pycon https://fosstodon.org/@pycon/112445571644276379

smach, to llm

New free course on #LLM agents from DeepLearning AI and crewAI:

“With crewAI, an open source library for building multi-agent systems, you'll get hands-on experience building agent crews for processes like:

💻 Tailoring resumes and interview prep for job applications
💻 Researching, writing, and editing technical articles
💻 Conducting customer outreach campaigns
💻 Financial analysis
💻 Planning events”

Taught by crewAI founder João (Joe) Moura

https://www.deeplearning.ai/short-courses/multi-ai-agent-systems-with-crewai/
#GenAI #AI

smach, to LLMs

“The general problem of mixing data with commands is at the root of many of our computer security vulnerabilities.” Great explainer by security researcher Bruce Schneier on why large language models may not be a great choice for tasks like processing your emails.
https://cacm.acm.org/opinion/llms-data-control-path-insecurity/

smach, to python

Ari Lamstein says his "Visualizing the Impact of Covid-19 on US Counties" blog post may be of interest if you want to "learn how to build data apps in Python, as the entire project is released under a permissive license (MIT), and is publicly available on GitHub."
Post: https://arilamstein.com/blog/2024/05/04/visualizing-the-impact-of-covid-19-on-us-counties/
Streamlit app: https://census-explorer.streamlit.app/
GitHub repo: https://github.com/arilamstein/censusdis-streamlit/tree/main

@python

smach, to LLMs

The TinyChart-3B LLM answers questions about data visualizations. It can also generate underlying data from a dataviz and Python code to re-create a similar chart.

Demo on Hugging Face: https://huggingface.co/spaces/mPLUG/TinyChart-3B

Code: https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/TinyChart

Paper: https://arxiv.org/abs/2404.16635 (8 authors from the Alibaba Group and Renmin University of China)

smach, to rstats

{tidycensus} 📦 creator Kyle Walker: "Want all 8.13 million US Census blocks available for your project? It's a one-liner in R thanks to the tigris and purrr packages:

us_blocks <- purrr::map_dfr(c(state.abb, "DC"), ~tigris::blocks(state = .x, year = 2023))

Downloading will take time; set options(tigris_use_cache = TRUE) beforehand to build a local cache of block shapefiles that you can access without having to download."

@rstats

smach, to rstats

The {summarytools} 📦 aims to:
“Provide a coherent set of easy-to-use descriptive functions [like] those in commercial statistical software suites such as SAS, SPSS, and Stata
“Offer flexibility in terms of output format & content
“Integrate well with commonly used software & tools for reporting”
Results can be displayed in console or rendered/saved as HTML, plain text, or R Markdown. By Dominic Comtois
https://htmlpreview.github.io/?https://github.com/dcomtois/summarytools/blob/master/doc/introduction.html
@rstats

smach, to rstats

File import/export in R is simple and elegant with the {rio} 📦. It uses just 2 main functions for dozens of file types: import() and export(). Whether .zip, .xlsx, Google Sheets, JSON, .rds, .csv or more, rio handles file-extension checks and selecting the right functions.
http://gesistsa.github.io/rio/
There's also a convert() function.
One of my favorite R packages!
By Thomas J. Leeper, Chung-hong Chan, David Schoch & Jason Becker
@rstats
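
A minimal sketch of the import(), export(), and convert() workflow described in the post above, assuming the rio package is installed. The temporary file paths and the choice of the built-in mtcars data set are illustrative assumptions, not part of the original post.

```r
library(rio)

# Write a built-in data frame to a temporary CSV file.
# rio picks the writer from the ".csv" file extension.
csv_path <- tempfile(fileext = ".csv")
export(mtcars, csv_path)

# Read it back with the same import() call used for any supported type.
cars_from_csv <- import(csv_path)

# convert() goes from one file type to another in a single call
# (here CSV to RDS).
rds_path <- tempfile(fileext = ".rds")
convert(csv_path, rds_path)
cars_from_rds <- import(rds_path)

dim(cars_from_csv)
dim(cars_from_rds)
```

CSV and RDS are used here because they should work with a default rio installation; some other formats may need extra packages.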

smach, to rstats

The {styler} 📦 “formats your code according to the tidyverse style guide (or your custom style guide) so you can direct your attention to the content of your code. It helps to keep the coding style consistent across projects and facilitate collaboration.” By Lorenz Walther & Kirill Müller

https://styler.r-lib.org/

@rstats
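
A short sketch of what that formatting looks like in practice, assuming the styler package is installed. The messy input lines are made up for illustration; style_text() restyles code held in a character vector, and styler also provides style_file() and style_dir() for files on disk.

```r
library(styler)

# Deliberately cramped code, one element per source line (hypothetical input).
messy <- c(
  "add<-function(a,b){",
  "a+b",
  "}"
)

# Restyle according to the default tidyverse style guide.
styled <- style_text(messy)
print(styled)

# A one-line example: spacing is added around the assignment and after commas.
one_liner <- as.character(style_text("x<-c(1,2,3)"))
one_liner
```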

smach, to LLMs

“But this doesn’t save any time!” 3 useful questions when trying LLMs:

  • Is there another way to get results I want? Don't give up right away.
  • Does AI make this task less or more annoying? Sometimes supervising drudge work feels better even if it's not faster; other times you'd still rather do it yourself.
  • Are results likely to improve as LLMs get better? If so, add a calendar reminder to try again in a few months. Or, keep a list of things you want to re-try post GPT-5 class models.

smach, to ai

“Star Trek's Holodeck is no longer just science fiction. Using AI, engineers have created a tool that can generate 3D environments, prompted by everyday language.” This was designed to train robots, not entertain us humans. But Star Trek fans can easily envision other uses.

https://www.sciencedaily.com/releases/2024/04/240411130301.htm

smach, to random

From {tidycensus} creator Kyle Walker: “My webinar Analyzing 2020 Decennial US Census Data in R is now on YouTube!
In the 3-hour webinar, you'll learn about:

📈 Available datasets in the 2020 US Census, and how to access and use them in R;
📈 How to explore decennial US Census data with tidyverse tools;
📈 Using interactive maps to explore US Census data;
📈 Advanced topics like working with detailed DHC-A data and analyzing change over time” 1/2

https://youtu.be/JQRS5wYtPlY?si=ex4SHu7Xm3BmxdTM

smach, to random

I'm not sure why Claude's Haiku LLM started responding in Spanish to questions in English in a RAG application I'm building when all of the context was in English. Odd.

smach, to random

NASA has a useful interactive map with solar eclipse time & totality info by zip code if you’re in the US and looking for info

https://eclipse-explorer.smce.nasa.gov/

smach, to random

“You can't know the totality of whatever topic you're learning. You enjoy the knowledge you receive in the pursuit of learning the topic. You keep a steady pace toward learning as much as you can. Instead of lamenting that you didn't start sooner you're grateful that you started at all.”
-Craig Maloney, in The Mediocre Programmer
https://themediocreprogrammer.com/build/html/the_mediocre_programmer.html

Craig passed away Tuesday at the age of 52
https://www.legacy.com/us/obituaries/name/craig-maloney-obituary?id=54791928

smach, to ai

I doubt it's coincidence that “GPT-5 is on the way!” news cropped up after some key industry analysts praised Anthropic's Claude Opus as better than GPT-4. Large language models at this scale may be new, but tech vendor strategies are not.

smach, to rstats

The {tidyHeatmap} 📦 “introduces tidy principles to the creation of information-rich heatmaps.”
“For plotting, you simply pipe the input data frame into heatmap, specifying:
The rows, cols relative column names (mandatory)
The value column name (mandatory)
The annotations column name(s)”
By Stefano Mangiola
https://github.com/stemangiola/tidyHeatmap
@rstats
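
A minimal sketch of the pipe-into-heatmap pattern quoted above, assuming tidyHeatmap, dplyr, tidyr, and tibble are installed. The use of mtcars, the selected columns, and the column names car, property, and value are illustrative assumptions. Only the mandatory row, column, and value arguments are shown; annotations are left out.

```r
library(dplyr)
library(tidyr)
library(tibble)
library(tidyHeatmap)

# Reshape mtcars into tidy (long) form: one row per car/property pair.
mtcars_tidy <- mtcars |>
  rownames_to_column("car") |>
  select(car, mpg, hp, wt, qsec) |>
  # Scale each measure so the colors are comparable across columns.
  mutate(across(-car, ~ as.numeric(scale(.x)))) |>
  pivot_longer(-car, names_to = "property", values_to = "value")

# Pipe the tidy data frame into heatmap(), naming the row, column,
# and value columns in that order.
hm <- mtcars_tidy |>
  heatmap(car, property, value)

hm
```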

smach, to llm

Update: I paid $20 for a month of Claude Pro to use Anthropic's GPT-4-class model, uploaded City Council meeting minutes, and used a modified version of Anthropic's “Cite your sources” suggested prompt for my queries. Responses to questions about what issues specific City Councilors discussed at the meeting were surprisingly good - and included source references. Worth further investigation.

https://masto.machlis.com/@smach/112108143932301141

smach, to llm

The {tidychatmodels} 📦 “provides a simple interface to chat with your favorite AI chatbot from R. It is inspired by the modular nature of tidymodels where you can easily swap out any ML model for another one but keep the other parts of the workflow the same.” Current support for OpenAI, Mistral.ai, and Ollama. By Albert Rapp on GitHub
https://tidychatmodels.albert-rapp.de/
@rstats

smach, to random

Matt Stiles: “🚨 Goofy side project alert 🚨
“I have a growing list of scrapers that collect the geocoded locations of popular restaurants, stores and other spots [in the US]. So far I have 58 companies and ~119,000 locations.
“Grab data/code — and contribute! — here: https://github.com/stiles/locations/

smach, to rstats

I'm setting up a new, more powerful Mac work system today, and it's very odd that my R sessions keep crashing when working with large data sets in RStudio. This didn't happen on the older, less capable Mac and it's not happening when I run the same scripts in VS Code.

I don't have time to troubleshoot this, so I'm just going to run those scripts in VS Code. But it's odd. Wonder if anyone else has experienced this and if there's a way to allocate more memory to #RStudio
@Posit @rstats #rstats
