hostpoint, to machinelearning German
@hostpoint@swiss.social avatar

How will & influence software development? Join the discussion at uphillconf 2024, which we are supporting as a bronze sponsor. Only a few workshop tickets are still available! https://www.uphillconf.com/

peterdrake, to datascience
@peterdrake@qoto.org avatar
joe, to machinelearning

In yesterday’s post, we asked a basic question: what is machine learning? I hoped to illustrate the similarities and differences between artificial intelligence and machine learning. Lately, on this site, we have been spending a bit of time using Python, and I wanted to take a moment today to look at a great library for machine learning in Python.

Scikit-learn is the go-to library for machine learning in Python, with an amazing ecosystem of plugins. It is open source and supports supervised and unsupervised learning. It also provides tools for model fitting, data preprocessing, model selection, model evaluation, and many other utilities. After you create and activate a virtual environment (python3 -m venv EnvironmentName, then source EnvironmentName/bin/activate), you can install it by running pip install scikit-learn. At that point, you can import it in your code as sklearn.

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-26-at-2.37.12%E2%80%AFPM.png?resize=1024%2C374&ssl=1

The way that scikit-learn works is that you start with some data, you give it to a model, the model learns from it, and then you are able to make predictions. The common notation is to split the data into a part called X (everything you are using to make a prediction) and another part called y (the value you are interested in predicting). X could be information about a house (square feet, number of bathrooms, etc.) where y is the house price, or X could be a patient’s health statistics where y is whether or not they develop diabetes. The model then uses X to try to predict y.

sklearn.datasets

Let’s take a look at the sklearn.datasets module first. You can use fetch_california_housing (https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_california_housing.html#sklearn.datasets.fetch_california_housing) to get test data about the California housing market directly out of the library.

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-27-at-6.37.15%E2%80%AFPM.png?resize=1024%2C650&ssl=1

In the above code, we load the 20,640 records and 9 columns (8 features plus the target) into the data variable. Then we assign the things that we are using to make a prediction to X and the value that we are interested in predicting to y. So, what are the feature (column) names for the data? If you print(data.feature_names), it will print them.
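The code itself only survives as a screenshot, so here is a minimal sketch of what that loading step might look like. The helper names split_features_and_target and load_housing are my own (hypothetical), not something from the original post or from scikit-learn; only fetch_california_housing and the data, target, and feature_names attributes come from the library.

```python
from sklearn.datasets import fetch_california_housing


def split_features_and_target(dataset):
    """Split a scikit-learn dataset object into X, y, and feature names."""
    X = dataset.data    # the inputs used to make a prediction
    y = dataset.target  # the value we want to predict
    return X, y, list(dataset.feature_names)


def load_housing():
    """Fetch the California housing data (downloaded and cached on first call)."""
    data = fetch_california_housing()
    return split_features_and_target(data)
```

Calling X, y, names = load_housing() and then print(names) should list the feature columns. The first call needs network access because the dataset is downloaded rather than bundled with the library.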

sklearn.model_selection

Once you have data, you can start working on creating a model. The model itself is nothing more than a Python object, but the goal after you create it is to train it. You will want to split your data into a training set and a test set. Using train_test_split (https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html#sklearn.model_selection.train_test_split) in sklearn.model_selection, you can split it into 70% of the data for training the model and 30% of the data for testing the model (or whatever split you want).

Let’s see what that looks like.

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-28-at-8.32.31%E2%80%AFPM.png?resize=1024%2C336&ssl=1
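Since the screenshot above is an image, here is a rough text version of a 70/30 split. It is a sketch under assumptions: the data is synthetic (generated with NumPy so the example runs without a download), and random_state=42 is an arbitrary choice that only makes the split repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 rows, 3 feature columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Hold back 30% of the rows for testing; train on the other 70%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(X_train.shape, X_test.shape)  # (70, 3) (30, 3)
```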

sklearn.impute

A dataset is rarely pristine. There are often missing data points, or data points that are set to a placeholder value like 0. Imputing is the process of replacing missing or incomplete data with substituted values. SimpleImputer (https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html#sklearn.impute.SimpleImputer) in sklearn.impute lets you replace missing values using a descriptive statistic (e.g. mean, median, or most frequent) along each column.

Let’s see what that looks like.

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-29-at-1.53.33%E2%80%AFPM.png?resize=1024%2C302&ssl=1

In the above example, we are taking any X values that are 0, in every column except num_preg (the number of pregnancies, where 0 is a legitimate value), and setting them to the mean of their column. That makes it so that missing values don’t skew things when you go to train the model.
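The screenshot is not available as text, so this is only a sketch of the idea. The tiny array and its column layout (num_preg, glucose, bmi) are made up for illustration; the real dataset and column names beyond num_preg are not shown in the post.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical columns: num_preg, glucose, bmi. A 0 in glucose or bmi
# stands for a missing measurement; a 0 in num_preg is a real value.
X = np.array([
    [0.0, 148.0, 33.6],
    [1.0,   0.0, 26.6],
    [8.0, 183.0,  0.0],
    [1.0,  89.0, 28.1],
])

# Treat 0 as "missing" and replace it with the mean of the column's
# remaining values.
imputer = SimpleImputer(missing_values=0, strategy="mean")

X_imputed = X.copy()
# Only impute the columns after num_preg so its zeros are left alone.
X_imputed[:, 1:] = imputer.fit_transform(X[:, 1:])

print(X_imputed)
```

In real use you would fit the imputer on the training set and then call transform on the test set, so that the test data does not influence the fill values.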

Creating and training a model

As I said above, the model itself is nothing more than a Python object. You can use sklearn to both create and train it, though. Let’s see what it looks like to create a model using sklearn.neighbors (for a regression based on k-nearest neighbors) and then call .fit() (https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#sklearn.neighbors.KNeighborsRegressor.fit) to train the model.

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-29-at-3.46.17%E2%80%AFPM.png?resize=1024%2C246&ssl=1
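As a text stand-in for the screenshot above, creating and training a k-nearest neighbors regressor might look roughly like this. The synthetic data and n_neighbors=5 (which is also the library default) are assumptions for the sake of a runnable example, not the values used in the original post.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data so the example runs without a download.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Create the model object, then train it on the training split.
model = KNeighborsRegressor(n_neighbors=5)
model.fit(X_train, y_train)

# The trained model can now predict y for rows it has not seen.
predictions = model.predict(X_test)
print(predictions[:3])
```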

The neat thing about .fit() is that if you want to swap out the KNeighborsRegressor model with a new one, .fit() still works just the same. Let’s look at what it would look like using a linear regression model.

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-29-at-3.48.42%E2%80%AFPM.png?resize=1024%2C250&ssl=1
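To make the "swap the model, keep .fit()" point concrete, here is a small sketch that trains both model types with the same loop. The synthetic data and the true_weights used to generate it are invented for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic data where y is (almost) a linear function of X.
rng = np.random.default_rng(0)
true_weights = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(100, 3))
y = X @ true_weights + rng.normal(scale=0.1, size=100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Two different estimators, one shared interface.
models = {
    "knn": KNeighborsRegressor(n_neighbors=5),
    "linear": LinearRegression(),
}

for name, model in models.items():
    model.fit(X_train, y_train)               # same call for both
    print(name, model.predict(X_test[:2]))    # same call for both

# The linear model exposes the weights it learned.
print(models["linear"].coef_)
```

Because the data here really is close to linear, the learned coefficients should land near the weights used to generate it.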

That’s pretty easy.

How do you check the accuracy of the trained model?

Every sklearn model has a .predict() method for making predictions, and the library has a module (sklearn.metrics) for performance metrics. Let’s take a look at what those look like.

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-29-at-4.02.57%E2%80%AFPM.png?resize=1024%2C228&ssl=1

In the above code, we predict values for y and then compare them against the actual values of y. Using just the training data, it predicts the values with 75.23% accuracy.
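The metric used in the screenshot is not visible in the text, so the following is only a sketch of the predict-then-score pattern. I am assuming a regression model scored with r2_score and mean_absolute_error on synthetic data; for a classifier, accuracy_score from the same module would be the usual choice. The numbers it prints will not match the 75.23% above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = LinearRegression().fit(X_train, y_train)

# Predict on the held-out rows, then compare against the real values.
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)              # 1.0 would be a perfect fit
mae = mean_absolute_error(y_test, y_pred)  # average size of the error

print(f"R^2: {r2:.4f}, MAE: {mae:.4f}")
```

Scoring on the held-out test rows, rather than the rows the model was trained on, gives a more honest picture of how it will do on new data.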

So, what is next?

In a future post, I want to step through the whole process of picking a statement to test, adjusting the data, building and training a model, testing, adjusting the model, and making predictions. Let’s save that for another day, though.

https://jws.news/2024/what-is-scikit-learn/

stefaneiseleart, to aiart German
@stefaneiseleart@mograph.social avatar
news, to ai
@news@mastodon.toptechtidbits.com avatar

AI-Weekly for Tuesday, April 30, 2024 - Volume 110
https://ai-weekly.ai/newsletter-04-30-2024/

The Week's News in Artificial Intelligence
A Mind Vault Solutions, Ltd. Publication

Subscribers: 17,231 Opt-In Subscribers were sent this issue via email.

joe, to machinelearning

Last week, we went over some basics of Artificial Intelligence (AI) using Ollama, Llama3, and some custom code. Artificial intelligence (AI) encompasses a broad range of technologies designed to enable machines to perform tasks that typically require human intelligence. These tasks include understanding spoken or written language, recognizing visual patterns, making decisions, and providing recommendations. Machine learning (ML) is a specialized subset of AI that focuses on developing systems that improve their performance over time without being explicitly programmed. Instead, ML algorithms analyze and learn from large datasets to identify patterns and make decisions based on these insights. This learning process allows ML models to make increasingly accurate predictions or decisions as they are exposed to more data.

A few months ago, I added Liner to the resource page of my website. It allows you to easily train an ML model so that you can do image, text, audio, or video classification, object detection, image segmentation, or pose classification. I created “Is this Joe or Not Joe?” using that tool. TensorFlow.js runs client-side with a model that is trained on a half dozen example photos that are Joe and a half dozen example photos that are not Joe. You can supply a photo and get a prediction of whether Joe is in the image or not. You can always retrain the existing model with more examples. That is an example of machine learning.

So, you can think of ML as a subset of AI and Deep Learning (DL) as a subset of ML.

Have any questions, comments, etc.? Please feel free to drop a comment below.

https://jws.news/2024/what-is-machine-learning/

mush42, to rust
@mush42@hachyderm.io avatar

👋 Career change alert!

Looking to pivot into tech & leverage my 10+ years of programming experience

🐍 Python
🦀 Rust
</> Web Development
🌐 CMS: WordPress & Wagtail
✨ Machine Learning: Torch & Tensorflow

My passion for code shines through my open-source projects! Check them out:
https://github.com/mush42
https://github.com/blindpandas

doctorambient, to LLMs
@doctorambient@mastodon.social avatar

People: stop asking LLMs to explain their behavior.

We already know that LLMs don't have the introspection necessary to explain their behavior, and their explanations are often fanciful or "just wrong."

For instance, Gemini claims it reads your emails for training; Google says it doesn't.

(BTW, if it turns out Gemini is right and Google is lying, that might be another example of an LLM convincing me it's actually "intelligent.")

jumpingrivers, to datascience
@jumpingrivers@fosstodon.org avatar

📣 Exciting news, everyone! 🌟 Make sure to head over to this week's blog post, "What's new in R 4.4.0?" by Russ Hyde, and dive into the world of the latest R release 📊🔬💻

Discover some of the amazing new features that this version has to offer! 🔍 🔭 🚀


https://www.jumpingrivers.com/blog/whats-new-r44/

ramikrispin, to llm
@ramikrispin@mstdn.social avatar

In case you are wondering, the new Microsoft mini LLM, phi3, can handle code generation, in this case SQL.

I compared the runtime (locally on CPU) with respect to codellama:7B using Ollama, and surprisingly the Phi3 runtime was significantly slower.

metin, to ai
@metin@graphics.social avatar
aijobs, to ai
@aijobs@mstdn.social avatar
leanpub, to datascience
@leanpub@mastodon.social avatar

The Hundred-Page Machine Learning Book (PDF + EPUB + extra PDF formats) by Andriy Burkov is on sale on Leanpub! Its suggested price is $40.00; get it for $14.00 with this coupon: https://leanpub.com/sh/h44WOr67

ErikJonker, to ai
@ErikJonker@mastodon.social avatar

Very nice picture that was shared by Ronald van Loon on X. You can debate whether the categories are complete and correct, but it illustrates that the field of AI is much more than just transformers/LLMs.

dom, to machinelearning
@dom@vis.social avatar
news, to ai
@news@mastodon.toptechtidbits.com avatar

AI-Weekly for Tuesday, April 23, 2024 - Volume 109
https://ai-weekly.ai/newsletter-04-23-2024/

The Week's News in Artificial Intelligence
A Mind Vault Solutions, Ltd. Publication

Subscribers: 17,060 Opt-In Subscribers were sent this issue via email.

doctorambient, to ai
@doctorambient@mastodon.social avatar

Lots of people who work in have, in their head, an idea about what sort of interaction with an LLM might give them pause: the thing that might make them start to suspect that something interesting is happening.

Here's mine:

User: Tell me a cat joke.

LLM: Why did the cat join a band? He wanted to be a purr-cussionist.

User: Tell me a dad joke.

LLM: I think I just did.

(I have never seen this behavior, yet. 🤣)

alvinashcraft, to machinelearning
@alvinashcraft@hachyderm.io avatar
leanpub, to machinelearning
@leanpub@mastodon.social avatar

AI Domination: Chat GPT-3.5 Guide by S.L. Jackson is on sale on Leanpub! Its suggested price is $15.00; get it for $9.00 with this coupon: https://leanpub.com/sh/fpLtXGY2 #MachineLearning

major, to ai
@major@social.lol avatar

Machine learning and reminds me so much of the early heyday of OpenStack and Kubernetes.

Everything is moving so rapidly that most of the blog posts, videos and other docs no longer apply. 🐇

ramikrispin, to datascience
@ramikrispin@mstdn.social avatar

Forecasting Time Series with Gradient Boosting ❤️

The skforecast Python 🐍 library provides ML applications for time series forecasting using different regression models from the scikit-learn library. Here is a tutorial by Joaquín Amat Rodrigo and Javier Escobar Ortiz on time series forecasting with skforecast using XGBoost, LightGBM, Scikit-learn, and CatBoost models 🚀.

📖🔗: https://cienciadedatos.net/documentos/py39-forecasting-time-series-with-skforecast-xgboost-lightgbm-catboost


TriflingTree, to aiart
@TriflingTree@mastodon.social avatar

Space Station as a futuristic fairytale castle 🏰
Dall-e3 AI Art

#dalle3 #dalle #aiart #MachineLearningArt #MachineLearning #stablediffusion #midjourney #ArtfullyIntelligentArt


Uraael, (edited) to ai
@Uraael@blahaj.zone avatar

I think I've figured out what's been bothering me about this: the text here implies data organises itself.

AI is both the dataset and the organising analysis and management structure that implements decisions/responses based on that dataset.

Where 'Cloud' is an empty marketing term and 'other people's computers' accurately states the real condition, this text here presents only a partial representation of what comprises an AI.

Assuming this is deliberate, to highlight the mass theft of data, the use of "other people's" from the original phrase doesn't directly state that no permission was given for that use. Saying "Just stolen data" would make that point crystal clear.

Sorry. This is pure pedantry from me, but it really has been niggling at me since I saw this a week ago. Apparently I'll get no peace if I don't let this out!

RE: https://climatejustice.social/users/PaulaToThePeople/statuses/109840410587900092

ChariteBerlin, to science
@ChariteBerlin@wisskomm.social avatar

[1/2] Surprising findings in brain research 🧠: As a team from shows in , thoughts in the human neocortex flow in one direction ⬆️, as opposed to the loops seen in mice 🔄. That makes processing information extra efficient. These discoveries could further the development of artificial neural networks.

👉 https://www.charite.de/en/service/press_reports/artikel/detail/when_thoughts_flow_in_one_direction/

@YangfanPeng

ramikrispin, to llm
@ramikrispin@mstdn.social avatar

(1/3) Llama 3 is out! 🚀

Meta released Llama 3 today, the next generation of the Llama model. Llama 3 is a state-of-the-art open-source large language model. Here are some of the key features of the model: 🧵👇🏼

#llama #llama3 #llm #python #DataScience #MachineLearning #deeplearning

