hcj, to random
@hcj@fosstodon.org avatar

Someone has made a website that allows you to run Llama 3 and other LLMs locally within a browser.

Website: https://secretllama.com/
Code: https://github.com/abi/secret-llama

obrhoff, to llm
@obrhoff@chaos.social avatar

The amazing thing about LLMs is how much knowledge they possess in their small size. The llama3-8b model, for instance, weighs only 4.7GB yet can still answer your questions about everything (despite some hallucinations).

boilingsteam, to llm
@boilingsteam@mastodon.cloud avatar

Run llama3 locally with 1M token context: https://ollama.com/library/llama3-gradient

pjk, to python
@pjk@www.peterkrupa.lol avatar

One thing you notice right away about LLMs is they bear a striking resemblance to that ubiquitous internet character, the reply-guy: they always have an answer, they are brimming with confidence, and they are often completely wrong.

It seemed only natural, then, to turn one into a full-time online version of itself, RGaaS (Reply-Guy as a Service), always on, always ready to reply to anything when summoned, with an “Actually,” followed by his (it’s always a “he”) opinion on what you should be thinking.

And so I built the Actually Bot.

https://www.peterkrupa.lol/wp-content/uploads/2024/05/actually_bot1.png

Basically, the way it works is that any user can reply to any Mastodon post (or I guess any post on the ActivityPub protocol) and tag the Actually Bot (@actuallybot) and it will reply with its most mediocre take on why everyone else is wrong, or at least not quite right.

The reply-guys can all move on to something else now, I have automated them out of a job.

This was a really fun project and has a lot of separate parts. First, I had to modify an LLM to give it the right “personality” for this job. To do this, I used Ollama, a tool for running open source LLMs locally, on your own machine. I used the “customized prompt feature”: You basically write a little file that Ollama uses to tweak the LLM of your choosing to give it certain characteristics. I went with the new Llama3:8b model from Meta as my base model, then put the following instructions in my modelfile:

FROM llama3
PARAMETER temperature 3
SYSTEM """You are receiving social media posts as prompts, and your job is to reply to these prompts. Please start all your replies with the word "actually". Reply as if you know more than everyone. You must recontextualize every prompt and correct or amend it, informing the prompter that they are wrong, even if ever so slightly. You write in short sentences. Your replies must be short. Keep them to about three sentences. This is essential: keep your replies short. They must be under 500 characters."""

Then I ran the following command in the console:

ollama create actually_llama -f ./actually_llama

… and my model was ready to roll. Next, I needed a program to connect to the Ollama API to send the LLM prompts and get responses. Python was great for that, as both Ollama and Mastodon have solid Python libraries. Probably the slowest part was picking through Mastodon.py to figure out how the methods work and what exactly they return. It’s a very robust library with a million options, and fortunately it’s also extremely well documented, so while it was slow going, I was able to whack it together without too much trouble.

I’m not going to get into all the code here, but basically, I wrote a simple method that checks mentions, grabs the text of a post and the post it is replying to, and returns them for feeding into the LLM as the prompt.
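
In outline, that method might look something like this (a sketch built on Mastodon.py, not the bot's actual code; the function name and the crude HTML-stripping regex are my own stand-ins):

import re

def get_mention_prompts(mastodon):
    # Gather (status, prompt_text) pairs for every new mention.
    prompts = []
    for notification in mastodon.notifications():
        if notification["type"] != "mention":
            continue
        status = notification["status"]
        text = re.sub(r"<[^>]+>", "", status["content"])  # posts come back as HTML
        if status["in_reply_to_id"]:  # grab the post being replied to, if there is one
            parent = mastodon.status(status["in_reply_to_id"])
            text = re.sub(r"<[^>]+>", "", parent["content"]) + "\n" + text
        prompts.append((status, text))
    return prompts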

Despite my very careful, detailed, and repetitive instructions to be sure replies are no more than 500 characters, LLMs can’t count, and they are very verbose, so I had to add a cleanup method that cuts the reply down to under 500 characters. Then I wrote another method that sends the prompt to Ollama and returns the cleaned-up response.
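
Together those two pieces might look something like this (a sketch using the ollama Python library as one way to call the Ollama API; the author may well have hit the REST endpoint directly, and the names here are mine):

import ollama

def trim_reply(text, limit=500):
    # The model ignores character limits, so enforce one ourselves,
    # cutting at the last whole word that fits.
    if len(text) <= limit:
        return text
    return text[: limit - 1].rsplit(" ", 1)[0] + "…"

def get_actually_reply(prompt):
    # Send the prompt to the customized model and return a post-sized reply.
    response = ollama.generate(model="actually_llama", prompt=prompt)
    return trim_reply(response["response"])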

The main body starts off by getting input for the username and password for login, then it launches a while True loop that calls my two functions, checking every 60 seconds to see if there are any mentions and replying to them if there are.
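
Wiring those together, the main body might look roughly like this (a sketch that reuses the two stand-in functions above; it assumes an app was already registered with Mastodon.create_app and its credentials saved to a file, and the filenames are illustrative):

import time
from getpass import getpass
from mastodon import Mastodon

if __name__ == "__main__":
    # client_id can point at a credentials file written earlier by Mastodon.create_app()
    mastodon = Mastodon(client_id="actuallybot_clientcred.secret")
    mastodon.log_in(input("Email: "), getpass("Password: "))

    while True:
        for status, prompt in get_mention_prompts(mastodon):
            reply = get_actually_reply(prompt)
            mastodon.status_post(reply, in_reply_to_id=status["id"])
        mastodon.notifications_clear()  # don't answer the same mentions again next pass
        time.sleep(60)  # check for new mentions once a minute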

OK it works! Now came the hard part, which was figuring out how to get to 100% uptime. If I want the Actually Bot to reply every time someone mentions it, I need it to be on a machine that is always on, and I was not going to leave my PC on for this (nor did I want it clobbering my GPU when I was in the middle of a game).

So my solution was this little guy:

https://www.peterkrupa.lol/wp-content/uploads/2024/05/lenovo.jpg

… a Lenovo ThinkPad with a 3.3GHz quad-core i7 and 8GB of RAM. We got this refurbished machine when the pandemic was just getting going and it was my son’s constant companion for 18 months. It’s nice to be able to put it to work again. I put Ubuntu Linux on it and connected it to the home LAN.

I actually wasn’t even sure it would be able to run Llama3:8b. My workstation has an Nvidia GPU with 12GB of VRAM and it works fine for running modest LLMs locally, but this little laptop is older and not built for gaming, and I wasn’t sure how it would handle such a heavy workload.

Fortunately, it worked with no problems. For a chatbot, waiting two minutes for a reply would be unacceptable, but for a bot that posts to social media it’s well within the range I was shooting for, and there didn’t seem to be any drop in the quality of the responses either.

The last thing I had to figure out was how to actually run everything from the Lenovo. I suppose I could have copied the Python files and tried to recreate the virtual environment locally, but I hate messing with virtual environments and dependencies, so I turned to the thing everyone says you should use in this situation: Docker.

This was actually great because I’d been wanting to learn how to use Docker for a while but never had the need. I’d installed it earlier and used it to run the WebUI front end for Ollama, so I had a little bit of an idea of how it worked, but the Actually Bot really made me get into its working parts.

So, I wrote a Dockerfile for my Python app, grabbed all the dependencies and plopped them into a requirements.txt file, and built the Docker image. Then I scp’d the image over to the Lenovo, spun up the container, and boom! The Actually Bot was running!
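
The Dockerfile for something like this can stay very small (a generic sketch, not the author's actual file; the filenames are guesses):

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY actually_bot.py .
# -u keeps print() unbuffered so output shows up in "docker logs" right away
CMD ["python", "-u", "actually_bot.py"]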

Well, OK, it wasn’t that simple. I basically had to learn all this stuff from scratch, including the console commands. And once I had the Docker container running, my app couldn’t connect to Ollama: it turns out that because Ollama is a server running on the host, I had to launch the container with a flag indicating that it shares the host’s network settings.
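
On Linux that flag is --network=host, which lets the containerized app reach Ollama on localhost:11434 as if it weren't in a container at all, so the run command ends up as something like this (the container and image names are illustrative):

docker run -d --network=host --name actually_bot actually_bot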

Then once I had the Actually Bot running, it kept crashing when people tagged it in a post that wasn’t a reply to another post. So, went back to the code, squashed bug, redeploy container, bug still there because I didn’t redeploy the container correctly. There was some rm, some prune, some struggling with the difference between “import” and “load” and eventually I got everything working.
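
For moving an image between machines without a registry, the sequence is presumably something like the following (docker save and docker load work on images, while docker import is for filesystem tarballs, which is where that confusion comes from; the hostname is made up):

docker save -o actually_bot.tar actually_bot
scp actually_bot.tar user@lenovo:
docker load -i actually_bot.tar    # run on the Lenovo
docker rm -f actually_bot          # clear out the old container before re-running it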

Currently, the Actually Bot is sitting on two days of uninterrupted uptime with ~70 successful “Actually,” replies, and its little laptop home isn’t even on fire or anything!

Moving forward, I’m going to tweak a few things so I can get better logging and stats on what it’s actually doing so I don’t have to check its posting history on Mastodon. I just realized you can get all the output that a Python script running in a Docker container prints with the command docker logs [CONTAINER], so that’s cool.

The other thing I’d like to do is build more bots. I’m thinking about spinning up my own Mastodon instance on a cheap hosting space and loading it with all kinds of bots talking to each other. See what transpires. If Dead Internet Theory is real, we might as well have fun with it!

https://www.peterkrupa.lol/2024/05/01/actually-building-a-bot-is-fun/

#Docker #Llama3 #Ollama #Python


kjr, to llm
@kjr@babka.social avatar

I am trying to build a RAG with LLAMA 3 and... going really crazy with the strange formats I get in the response...
Not just the answer itself, but additional text, XML tags...

simondueckert, to windows German
@simondueckert@colearn.social avatar

LM Studio is an app for Windows, macOS, and Linux that lets you run open-source language models (LLMs) like Llama, Mistral & Co. locally on your machine: https://lmstudio.ai

It's privacy-friendly and saves energy (a GPT query uses 15x more energy than a Google search).

The KI MOOC is a good opportunity to try out open alternatives alongside the "classic" AI tools: https://www.meetup.com/de-DE/cogneon/events/297769514/

jamesravey, to ai
@jamesravey@fosstodon.org avatar

I've been investigating whether small models like Phi and Llama generalise well to bio-medical Q&A use cases. Small models that don't require data centres full of GPUs to run are starting to become competitive with big commercial LLMs. https://brainsteam.co.uk/2024/04/26/can-phi-and-llama-do-biology/

ceoln, to ai
@ceoln@qoto.org avatar

I've had occasion to ask an AI about a thing twice lately (a recent online phenomenon, and a book recommendation). Both times I asked both Gemini and ChatGPT, and both times one gave a reasonable if bland answer, and the other (a different one each time) gave a plausible but completely fictional ("hallucinated") answer.

When do we acknowledge that LLMs, and "AI" in general, aren't quite ready to revolutionize the world?

ceoln,
@ceoln@qoto.org avatar

And I tried from and it just made stuff up, too. Geeeez!

kellogh, to random
@kellogh@hachyderm.io avatar

i asked llama3 to generate an SVG of a goose being chased by a taxi, because there's no way i'm logging into my facebook in a NAKED BROWSER just to generate pics (hackers be hacking).

i do not condone the results. i guess the yellow thing is a taxi? god my 3yo draws better

ErikJonker, to ai
@ErikJonker@mastodon.social avatar

Playing around with https://poe.com/, seriously thinking about quitting ChatGPT Plus (the paid service) for this. The flexibility in switching models (Claude, Llama, GPT, etc.) is amazing; I am wondering what I would miss compared with ChatGPT Plus.

doctorambient, to LLMs
@doctorambient@mastodon.social avatar

People: stop asking LLMs to explain their behavior.

We already know that LLMs don't have the introspection necessary to explain their behavior, and their explanations are often fanciful or "just wrong."

For instance, Gemini claims it reads your emails for training, Google says it doesn't.

(BTW, if it turns out Gemini is right and Google is lying, that might be another example of an LLM convincing me it's actually "intelligent.")

doctorambient,
@doctorambient@mastodon.social avatar

A Llama3 example:


>>> Do you happen to know what your context window length is?

Llama: I'm an AI model, and I don't have a fixed "context window" in the classical sense. My training data consists of a massive corpus of text, which I use to generate responses.

The correct answer is 8k.

troed, to llm
@troed@ioc.exchange avatar

Asked LLama-3 to implement a CRC32 routine in C. The 8B model.

With the exception of it forgetting to declare the table array, the code compiled without errors.

I also asked it to run the code on a test string, which it did and explained at each step what the intermediate CRC32 was.

Well. The result was wrong. Both when it executed the code itself, as well as when I compiled and ran it ;)

But this would definitely confuse someone who tried to use it for coding. I see nothing wrong with the code - it all looks perfect. If I get the time I might look into why it's not correct.

janhutter, to til
@janhutter@colearn.social avatar

TIL that a 7B model (Q4 variant) can be downloaded in LM Studio and used as an AI assistant on my 5+ year old Lenovo notebook (without a dedicated graphics card).

narrowcode, to ai
@narrowcode@graz.social avatar

The release of Llama3 convinced me to try running a local LLM. I was pleasantly surprised about the performance and how easy it was to set up, so I wrote a blog post about the process:

https://narrowcode.xyz/blog/2024-04-23_taming-llamas_leveraging-local-llms/

joe, (edited ) to programming

Yesterday, we played with Llama 3 using the Ollama CLI client (or REPL). Today I figured that we would play with it using the Ollama API. The Ollama API is documented on their Github repo. Ollama has a client that runs when you run ollama run llama3 and a service that can be accessed from something like MindMac, Amallo, or Enchanted. The service is what starts when you run ollama serve.

In our first Llama 3 post, we asked the model for “a comma-delimited list of cities in Wisconsin with a population over 100,000 people”. Using Postman and the completion API endpoint, you can ask the same thing.

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-20-at-1.30.48%E2%80%AFPM.png?resize=1024%2C811&ssl=1

You will notice the stream parameter is set to false in the body. If the value is false, the response will be returned as a single response object, rather than a stream of objects. If you are using the API with a web application, you will want to ask the model for the answer as JSON and you will probably want to provide an example of how you want the answer formatted.
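
For reference, the body of that request is roughly this shape (a POST to the /api/generate endpoint per the Ollama API docs; the prompt is the one quoted above):

{
  "model": "llama3",
  "prompt": "Give me a comma-delimited list of cities in Wisconsin with a population over 100,000 people.",
  "stream": false
}

Ollama also accepts a "format": "json" field in this body if you want the response constrained to valid JSON.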

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-20-at-1.45.15%E2%80%AFPM.png?resize=1024%2C811&ssl=1

You can use Node and Node-fetch to do the same thing.

If you run it from the terminal, it will look like this:

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-20-at-2.01.19%E2%80%AFPM.png?resize=1024%2C932&ssl=1

Have any questions, comments, etc? Please feel free to drop a comment, below.

https://jws.news/2024/lets-play-more-with-llama-3/

joe, to ai

Last week, Meta announced Llama 3. Thanks to Ollama, you can run it pretty easily. There are 8b and 70b variants available, in both pre-trained and instruction-tuned versions. I am not seeing it on the Hugging Face Leaderboard yet, but the little I have played around with it has been promising.

Here are two basic test questions:

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-20-at-12.15.45%E2%80%AFPM.png?resize=989%2C1024&ssl=1

https://i0.wp.com/jws.news/wp-content/uploads/2024/04/Screenshot-2024-04-20-at-12.27.47%E2%80%AFPM.png?resize=989%2C1024&ssl=1

Have any questions, comments, etc? Please feel free to drop a comment, below.

https://jws.news/2024/lets-play-with-llama-3/

ErikJonker, to ai
@ErikJonker@mastodon.social avatar

ChatGPT from OpenAI is a service; it's not necessarily the same as the model (GPT-4) that it is using in the background. OpenAI adds some elements like the code interpreter, which makes it perform (much) better than models without such features. Regardless, OpenAI faces some good competition from the Llama3 models; I hope it will stimulate them to quickly release GPT-5.

stefano, to llm
@stefano@bsd.cafe avatar

I just asked Llama3 Instruct (7B Q8): "Tell me all you know about ZFS on OpenBSD"

Llama3 replied: "A great topic!

ZFS is a combined file system and logical volume manager designed by Sun Microsystems, now owned by Oracle. It's known for its advanced features, high performance, and reliability. On OpenBSD, ZFS is available as part of the base system since version 5.4 (released in 2011). Here are some key aspects of using ZFS on OpenBSD:

Installation

To install ZFS on OpenBSD, you'll need to specify zfs as your file system during installation. This will create a ZFS pool with a single vdev (virtual device) containing the root file system.

ZFS Pool

A ZFS pool is the top-level container for all ZFS storage. It can contain one or more virtual devices (vdevs), which are used to store data and metadata. On OpenBSD, you'll typically create a single pool with a single vdev containing the root file system.

VDEVs

Vdevs are the building blocks of a ZFS pool. They're composed of one or more disk devices, and can be configured in various ways (e.g., RAID-Z, mirror, single-disk). OpenBSD supports the following vdev types:

  • raidz: A type of RAID-5, providing redundancy with data striping.
  • mirror: A simple mirroring configuration for each block.
  • single: A single disk vdev, used for a standalone pool or for testing.

ZFS Features

Some notable ZFS features available on OpenBSD:

  • Snapshots: Instantaneous copies of your file system at a specific point in time. You can take snapshots, clone them, and even revert to previous states if needed.
  • Send/Receive: Allows you to send and receive entire ZFS pools or datasets between systems.
  • Checksums: Data integrity checks are performed on all data written to disk.
  • Compressed: Compression is supported for both data and metadata, reducing storage requirements.
  • Quotas: Limits can be set for individual datasets to prevent excessive usage.

OpenBSD-specific ZFS settings

Some OpenBSD-specific configuration options:

  • zfs_load: Controls whether the ZFS module is loaded at boot time. Set it to YES in /etc/sysctl.conf to enable ZFS support.
  • zfs_root: Specifies the root file system dataset name. Typically set to tank/root.
  • zfs_pool: Sets the default pool name (e.g., tank).

ZFS CLI

The OpenBSD ZFS command-line interface (CLI) is similar to other Unix-like systems:

# zpool list # List all pools and their status  
# zfs list # List all datasets in a pool  
# zfs snapshot # Create a new snapshot of a dataset  
# zfs clone # Clone a snapshot or dataset  
# zfs send # Send a dataset or snapshot to another system  

Tips and Considerations

  • Be cautious when using ZFS, as some operations (e.g., zpool destroy) can be destructive.
  • Regularly check your pool's health with zpool status and ensure you have enough disk space.
  • Use zfs snapshots to create regular backups of your file systems.

This is just a brief overview of using ZFS on OpenBSD. If you have specific questions or concerns, feel free to ask!"

We're safe. And I should ask Theo why he's always lied and never announced ZFS support on OpenBSD. 😆

ErikJonker, to ai
@ErikJonker@mastodon.social avatar

The score of Llama3 70B on the LMSYS leaderboard is impressive, although it's also clear that the latest GPT-4 is still a lot better. However, Llama3 is open source and freely available, and a larger version (400B parameters) is on the way that will be closer to GPT-4 in performance on the various benchmarks.
https://chat.lmsys.org/?leaderboard

ErikJonker, to ai
@ErikJonker@mastodon.social avatar

Are there already experiences with installing Llama 3 on your own hardware (on premises or a virtual server), as compared with running it via Azure, Amazon, or Google Cloud?
Contemplating this as an option.

ErikJonker,
@ErikJonker@mastodon.social avatar

...also if you just want to experiment on your own laptop with Llama3 there are various options:
https://gpt4all.io/index.html
https://ollama.com/
https://lmstudio.ai/
Of course, larger models need serious hardware, but the smallest Llama3 model also runs on relatively old hardware. Check it out.

boilingsteam, to linux
@boilingsteam@mastodon.cloud avatar

Llama3 is Out, Probably the Best Open Large Language Model To Date: https://ai.meta.com/blog/meta-llama-3/

gptchatly, to meta
@gptchatly@mastodon.social avatar

Outside North America you can try Meta's new Llama-3 70B model without the need for registration or login:
https://gptchatly.com/meta-llama-3-70b.html

jamesravey, to opensource
@jamesravey@fosstodon.org avatar

I'm really impressed with Llama 3, and I spent a few hours this afternoon setting up a self-hosted UI for it using @ollama and an old Nvidia GPU I had lying around https://brainsteam.co.uk/2024/04/20/self-hosting-llama-3-on-a-home-server/
