@ramikrispin@mstdn.social

ramikrispin

Data science and engineering senior manager at Apple | #rstats & #Python | 📦 dev | ❤️ time-series analysis & forecasting | Author. Opinions are my own | https://linktr.ee/ramikrispin

ramikrispin, to random

Looking forward to the WWDC 2024 opening keynote ❤️

📆 When: Today, 10 AM PST

📽️ Live stream: https://www.apple.com/apple-events/

#WWDC

ramikrispin, to datascience

Building a GPT-2 from scratch 🚀

Andrej Karpathy released a tutorial today for reproducing GPT-2 from scratch. OpenAI released GPT-2 in 2019; the version reproduced here is the 124M-parameter model. This four-hour tutorial covers setting up the GPT-2 network and then training and optimizing its parameters.

It looks like a really cool tutorial; I hope to find the bandwidth to watch it in the coming weeks!

📽️ https://www.youtube.com/watch?v=l8pRSuU81PU

ramikrispin, to datascience

DevOps for Data Science - New Book 🚀

Always happy to see new MLOps books! DevOps for Data Science is a new book by Alex K Gold. As the name implies, the book focuses on DevOps topics for data scientists. This includes the following:
✅ Command line
✅ Working with Linux systems
✅ Docker
✅ Scaling resources
✅ Network, domains, DNS, SSL, etc.
✅ Authentication

#DataScience #mlops #devops

ramikrispin, to vscode

This weekend, I am working on an R + Docker 🐳 workshop for the R/Medicine conference. The workshop focuses on setting up a dockerized R development environment using VS Code and the Dev Containers extension 🚀. It is based on a tutorial I created last year on the same setup. It is also a great opportunity to test the new Shiny extension for VS Code by Posit, which was released recently 😎.

ramikrispin, to python

(1/5) Happy Saturday! ☀️
Here are a few steps you can take to reduce your Python 🐍 Docker image size 👇🏼

TL;DR: Use a slim image and a multi-stage build

ramikrispin,

(2/5) Slim image
Typically, I use the official Python image as the baseline for setting up a dockerized Python environment. The official Python image comes in multiple variants for different Linux flavors and CPU architectures. The default image (e.g., python:latest) bundles a comprehensive set of supporting tools, which brings its size to about 1 GB.

A simple way to reduce the image size is to replace the default image with the slim variant, which is about 150 MB (compared to 1 GB) 🚀.

ramikrispin, to python

Posit recently released a new Shiny extension for VS Code, supporting both Shiny for R and Python 🚀

More details in the release post 👇🏼
https://shiny.posit.co/blog/posts/shiny-vscode-1.0.0/

Extension 🔗: https://marketplace.visualstudio.com/items?itemName=Posit.shiny

ramikrispin, to machinelearning

(1/2) I am excited to present at the useR! 2024 conference on July 2nd!

I am going to run a virtual workshop about deploying and monitoring data and ML pipelines using free and open-source tools. This includes setting up pipelines using GitHub Actions, Docker 🐳, R, and Quarto 🚀.

When 📆: July 2nd at 10 AM PST

ramikrispin,

(2/2) The event is virtual and open to all. More details and registration are available at the link below (search for the event) 👇🏼

https://events.linuxfoundation.org/user/program/virtual-schedule/

Thanks to the conference organizers for the invite!

ramikrispin, to ArtificialIntelligence

(1/2) Congratulations to my friend Lior and his co-author Meysam on the release of their new book - Mastering NLP from Foundations to LLMs 🎉

I met Lior a few years ago at a conference, and since then, I have been following his work in the field of NLP ❤️.

ramikrispin,

(2/2) The book covers the following topics:
✅ Mathematical foundations of machine learning and NLP
✅ Data preprocessing techniques for text data
✅ Machine learning applications for NLP and text classification
✅ Deep learning methods for NLP and text applications
✅ Theory and design of Large Language Models
✅ Applications of LLMs
✅ LLM applications with LangChain

The book is for folks who are interested in getting started with NLP and those who wish to delve into LLM applications.

ramikrispin, to opensource

I am excited to present at the AI_dev Europe conference in Paris on June 19!

I am going to run a workshop about the deployment and monitoring of ML pipelines with free and open-source tools. This includes using tools such as GitHub Actions and Pages, Docker, Python, Quarto, etc.

More details are available on the conference website 👇🏼
https://events.linuxfoundation.org/ai-dev-europe/

Thanks to the Linux Foundation and the conference organizers for the invite!

ramikrispin, to datascience

DuckDB can now read data from Hugging Face via the hf:// prefix 👇🏼

https://duckdb.org/2024/05/29/access-150k-plus-datasets-from-hugging-face-with-duckdb

#data #duckdb #DataScience #huggingface
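
Here is a rough sketch of what using the hf:// prefix from Python could look like. It assumes a recent DuckDB release with hf:// support and network access. The helper names are hypothetical, and the repository and file in the usage comment are only an example path, so swap in whichever dataset you want to read. The query builder is kept separate from the function that actually hits the network, so it can be checked offline.

```python
def hf_path(repo: str, file: str) -> str:
    """Build an hf:// URL for a file stored in a Hugging Face dataset repo."""
    return f"hf://datasets/{repo}/{file}"


def preview_query(repo: str, file: str, limit: int = 5) -> str:
    """Build a SQL query that previews the first rows of a remote file."""
    return f"SELECT * FROM '{hf_path(repo, file)}' LIMIT {int(limit)}"


def preview(repo: str, file: str, limit: int = 5):
    """Run the preview query with DuckDB (needs the duckdb package and network)."""
    import duckdb  # imported lazily so the helpers above work without it

    return duckdb.sql(preview_query(repo, file, limit)).fetchall()


# Example usage (example repo and file names, requires network access):
# rows = preview("datasets-examples/doc-formats-csv-1", "data.csv")
# print(rows)
```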

ramikrispin, to random

That moment when it all goes green 😎

ramikrispin, to datascience

Gradient Descent Visualization 👇🏼

I was looking for examples of interactive data visualization for gradient descent algorithms, and I found this app by Lili Jiang. This desktop app is built with C++ and enables simulation and visualization of different gradient descent algorithms, such as momentum, AdaGrad, RMSProp, and Adam. The app lets you compare different methods side by side.

https://github.com/lilipads/gradient_descent_viz

Image credit: App repository

#DataScience #MachineLearning
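
To make the comparison concrete, here is a small Python sketch (not the app's C++ code) that runs the four methods mentioned above on a toy quadratic loss. The loss, the starting point, and every learning rate and decay value are assumptions picked for illustration, and the update rules follow the standard textbook forms.

```python
import math

A, B = 1.0, 10.0  # curvatures of the toy loss (illustrative values)


def loss(p):
    return 0.5 * (A * p[0] ** 2 + B * p[1] ** 2)


def grad(p):
    return [A * p[0], B * p[1]]


def momentum(lr=0.01, beta=0.9):
    v = [0.0, 0.0]
    def step(p, g):
        for i in range(2):
            v[i] = beta * v[i] + g[i]  # accumulate a velocity term
            p[i] -= lr * v[i]
    return step


def adagrad(lr=0.5, eps=1e-8):
    s = [0.0, 0.0]
    def step(p, g):
        for i in range(2):
            s[i] += g[i] ** 2  # running sum of squared gradients
            p[i] -= lr * g[i] / (math.sqrt(s[i]) + eps)
    return step


def rmsprop(lr=0.1, rho=0.9, eps=1e-8):
    s = [0.0, 0.0]
    def step(p, g):
        for i in range(2):
            s[i] = rho * s[i] + (1 - rho) * g[i] ** 2  # decaying average
            p[i] -= lr * g[i] / (math.sqrt(s[i]) + eps)
    return step


def adam(lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m, v, t = [0.0, 0.0], [0.0, 0.0], [0]
    def step(p, g):
        t[0] += 1
        for i in range(2):
            m[i] = b1 * m[i] + (1 - b1) * g[i]
            v[i] = b2 * v[i] + (1 - b2) * g[i] ** 2
            m_hat = m[i] / (1 - b1 ** t[0])  # bias correction
            v_hat = v[i] / (1 - b2 ** t[0])
            p[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return step


OPTIMIZERS = {"momentum": momentum, "adagrad": adagrad, "rmsprop": rmsprop, "adam": adam}


def run(make_step, start=(4.0, 2.0), n_steps=300):
    """Run one optimizer from the same start and return the final loss."""
    p, step = list(start), make_step()
    for _ in range(n_steps):
        step(p, grad(p))
    return loss(p)


results = {name: run(make) for name, make in OPTIMIZERS.items()}
for name, final_loss in results.items():
    print(f"{name:>8}: final loss = {final_loss:.3e}")
```

All four start from the same point, so the printed final losses give a rough, text-only version of the side-by-side comparison the app shows visually.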

ramikrispin, to python

Mark your calendar: NumPy 2.0 is going to be out on June 16th 🚀

This is the first major release since 2006. The release includes breaking changes in the library API, and therefore, if you are planning to adopt it, some code refactoring may be required.

The release includes new features, performance improvements 🏎️, improvements to the C API, and more.

More details are available in the release notes: https://numpy.org/devdocs/release/2.0.0-notes.html
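
As one example of the kind of refactoring involved, some legacy aliases such as np.float_, np.NaN, and np.Inf are removed in NumPy 2.0. The sketch below uses spellings that work on both 1.x and 2.x. The helper name is made up for illustration, and the three aliases it checks are only a sample, so treat the release notes as the authoritative list.

```python
import numpy as np

# Major version of the installed NumPy, e.g. 1 or 2.
NUMPY_MAJOR = int(np.__version__.split(".")[0])

# Portable spellings: np.float64, np.nan, and np.inf exist on 1.x and 2.x.
values = np.array([1.0, np.nan, np.inf], dtype=np.float64)


def legacy_aliases_present(names=("float_", "NaN", "Inf")):
    """Return which of the sampled legacy aliases the installed NumPy still has."""
    return [name for name in names if hasattr(np, name)]


print("NumPy major version:", NUMPY_MAJOR)
print("Legacy aliases still available:", legacy_aliases_present())
```

Running it before and after an upgrade is a quick way to see whether code that relies on these aliases needs a change.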

ramikrispin, to datascience

(1/2) Shiny Apps for demystifying statistical models and methods 🚀

This is a cool website that explains different statistical concepts using interactive Shiny apps. The website was made by Ben Prytherch from the Department of Statistics at Colorado State University.

#DataScience #Stats #statistics #MachineLearning #RStats

ramikrispin,

(2/2) It covers the following topics:
✅ Factorial ANOVA
✅ Mixed effect ANOVA
✅ Mixed effect with random slopes
✅ Logistic regression
✅ ANCOVA
✅ One-way ANOVA
✅ Odds ratio vs relative risk
✅ Correlation coefficient vs slope
✅ Sampling

Great use case of Shiny apps 👇🏼
https://sites.google.com/view/ben-prytherch-shiny-apps/shiny-apps

ramikrispin, to datascience

Building robust data pipelines with dbt, Airflow, and Great Expectations 🚀

I started to dive into Great Expectations, a Python library for data quality checks, and I found this great talk by Sam Bail about building data pipelines with dbt, Airflow, and Great Expectations.

📽️ https://www.youtube.com/watch?v=yJFHgNWmoMg

ramikrispin, to python

Cohort Revenue & Retention Analysis with Python 🚀

For those who work with cohort data, I recommend checking out Dr. Juan Orduz's tutorial on cohort revenue and retention analysis with PyMC 👇🏼

https://www.pymc-labs.com/blog-posts/cohort-revenue-retention/

#python #DataScience #Bayesian #pymc

ramikrispin, to machinelearning

Machine Learning for Beginners 🚀

Machine Learning for Beginners by Microsoft Developer is an introductory course on classical machine learning. This crash course mainly focuses on regression analysis with Python 🐍, and it covers topics such as:
✅ General setup
✅ Cleaning data
✅ Data visualization
✅ Regression models
✅ Polynomial regression
✅ Logistic regression

📽️ https://www.youtube.com/playlist?list=PLlrxD0HtieHjNnGcZ1TWzPjKYWgfXSiWG

ramikrispin, to python

Happy Friday! ☀️

Scientific Python Lectures 🚀

Here is a short e-book with a sequence of tutorials on the scientific Python ecosystem for beginners. This includes topics such as:
✅ Working with numerical data using NumPy
✅ Data visualization with Matplotlib
✅ Scientific computing with SciPy
✅ Statistics with Python
✅ Machine learning with scikit-learn

https://lectures.scientific-python.org

Thanks to the tutorial contributors!

#python #DataScience #MachineLearning

ramikrispin, to random

(3/4) Installation: pip install plotnine

The library has additional extensions that you can install to extend its functionality.

License: MIT 🦄

Plotnine contest: https://posit.co/blog/announcing-the-2024-plotnine-contest/

ramikrispin,

(4/4) Resources 📚
Plotnine contest: https://posit.co/blog/announcing-the-2024-plotnine-contest/
Source code: https://github.com/has2k1/plotnine
Documentation: https://plotnine.org

ramikrispin, to python

(1/4) TIL about the plotnine library - the grammar of graphics in Python 🚀

I had never heard of the plotnine library until I came across the Posit plotnine contest (see the link below). plotnine is a Python implementation of the grammar of graphics, based on the ggplot2 library.

#python #rstats #dataviz #DataScience #opensource

ramikrispin,

@transportationtalk Nice! It is also interactive!
