bryansmart,

@chikim I've really been enjoying VOLlama. Nice work! It would be nice to be able to switch between OpenAI and local models without going into the API prefs. More accelerator keys for menu options would be good, too. Could a blank line maybe be inserted in the log between each entry? Last, can you trap key-down on the Control key to stop the system voice? I know it's a hobby project, so no idea how much time you have for any of that, but just throwing them out there.

chikim,

@bryansmart Thanks. What do you mean by a blank line between each entry? Like user: bla bla, blank line, llama: bla bla, blank line? That would be the easiest request to implement. Switching between platforms is a little tricky because I have to keep track of which model you used with which platform. I'm sure there's a way, but catching a modifier key by itself will also be tricky. Pause/resume will be significantly more work because of how each system implements the API calls for TTS. Also, I'm using threading to feed the text.
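For the Control-key idea, a rough sketch in wxPython might look something like this (untested, and stop_speech is a hypothetical hook into the TTS thread, not real VOLlama code):

    import wx

    class ChatFrame(wx.Frame):
        def __init__(self):
            super().__init__(None, title="Chat")
            # EVT_CHAR_HOOK sees keys before the focused child widget does,
            # which is one way to catch a bare modifier press.
            self.Bind(wx.EVT_CHAR_HOOK, self.on_key)

        def on_key(self, event):
            if event.GetKeyCode() == wx.WXK_CONTROL:
                self.stop_speech()  # hypothetical: signal the TTS thread to stop
            event.Skip()  # pass the key along so normal handling still works

        def stop_speech(self):
            pass  # would need per-platform TTS stop logic here

    if __name__ == "__main__":
        app = wx.App()
        ChatFrame().Show()
        app.MainLoop()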

bryansmart,

@chikim Re blank lines, that's right. It's part of a struggle, after each query, of finding the beginning of the latest response. It might also help if the current browsing position in the log didn't always jump to the end, but stayed where it was left. Not sure if that's possible.

bryansmart,

@chikim For changing platforms, if you can't remember the last model, that's okay, but maybe move the platform selector combo box to the toolbar? Right now, I have to go to the menus, select the correct menu, select the option, move to the model combo box, pick a model, and dismiss the dialog. If you can't move it to the toolbar, how about an accelerator for the API Settings dialog?

chikim,

@bryansmart The alternative is to press Alt or Option+Up to go into edit mode, and it'll paste one message at a time into the prompt field.

bryansmart,

@chikim I'm not sure what you mean. I tried those commands here, and they just navigate through the text, like normal.

chikim,

@bryansmart Alt or Option+Up/Down will let you edit the history context. It shows one message at a time in the prompt field and lets you edit it. If you just want to review without editing, press Escape, or Alt/Option+Down all the way to the bottom when you're done reading.

bryansmart,

@chikim Ah. That does what I wanted. If you hadn't told me, how would I have discovered this? The commands aren't in the menus, and there doesn't seem to be a Help item.

bryansmart,

@chikim Also, are the RAG options incomplete? Whenever I've tried them with local models, I always get errors. I'm mainly using Llama 3 and Command-R, if that matters. To start, I was trying to do things like feed them a folder of text files to see if I could ask questions about them, get summaries, etc., but I get errors during the indexing.

chikim,

@bryansmart Hmm, what kind of error? @vick21 also told me he got an out-of-memory error while creating embeddings.

bryansmart,

@chikim @vick21 Ollama call failed with status code 404. Details: model 'nomic-embed-text' not found, try pulling it first
Traceback (most recent call last):
  File "RAG.py", line 74, in loadFolder
  File "RAG.py", line 33, in build_index
  File "llama_index/core/instrumentation/dispatcher.py", line 274, in wrapper
  File "llama_index/core/base/embeddings/base.py", line 251, in get_text_embedding

chikim,

@bryansmart @vick21 Ah, you forgot to download the embedding model. Run: ollama pull nomic-embed-text

bryansmart,

@chikim @vick21 I’m looking at your GitHub page for it, and I see those directions are there. I didn’t even notice them when I started. I saw the app, grabbed the app, it worked, so that was that. Haha. I feel dumb. Sorry.

chikim,

@bryansmart @vick21 No problem. I totally get it. I hate writing manuals/user guides/readmes as well as reading them. lol

vick21,

@chikim @bryansmart And this is what AI should be able to do for us. :)

chikim,

@vick21 @bryansmart That's actually a great idea! VOLlama has only a few files, so I should try feeding everything in and asking it to write a manual. hahahaha

bryansmart,

@chikim Any tips for getting started with RAG? I looked at the section in the ReadMe. I imported a folder of text files. I have the local llama3 model selected. I asked it to generally describe what the files discuss, and it told me about the files used during the training of Llama 3. Can I only use the nomic model?

chikim,

@bryansmart Did you start your question with /q?

bryansmart,

@chikim No. I saw that in your ReadMe, but thought it was only for queries about URLs. Hmm. I’ll try.

chikim,

@bryansmart Also, you can't ask general questions, like asking for a summary, that require reading the entire thing. The way it works, it compares your question against the documents through embeddings, then retrieves and feeds the model a few chunks that might be relevant to answering your question.
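Roughly, with LlamaIndex and Ollama it's the equivalent of something like this (a minimal sketch, not VOLlama's exact code; the folder path and question are just examples):

    from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
    from llama_index.embeddings.ollama import OllamaEmbedding
    from llama_index.llms.ollama import Ollama

    # The embedding model only turns text into vectors for chunk lookup.
    Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")
    # The chat model is what actually reads the retrieved chunks and answers.
    Settings.llm = Ollama(model="llama3")

    documents = SimpleDirectoryReader("my_folder").load_data()
    index = VectorStoreIndex.from_documents(documents)

    # /q embeds the question, pulls the top-k most similar chunks,
    # and hands only those chunks to the chat model.
    engine = index.as_query_engine(similarity_top_k=3)
    print(engine.query("What do these files say about indexing?"))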

bryansmart,

@chikim Hmm. How is that better than a general text search? I thought I’d be able to use it to generate new text in the style of the indexed files and such.

chikim,

@bryansmart If you want to see which chunks are fed into the model to answer your question, you can check the only checkbox in the RAG settings. lol I don't think it's labeled on Mac due to a wxWidgets bug.

bryansmart,

@chikim Does using /q send the prompt only through the nomic model? That is to say, does the active model affect the RAG results at all?

chikim,

@bryansmart The embedding model is only used during indexing. The quality of the answer depends on the model you're chatting with, because it reads the retrieved chunks as text and gives you the answer. It really depends on whether LlamaIndex was able to retrieve the relevant chunks or not. You can increase the number of chunks and the chunk length, but you might end up feeding in chunks that aren't related to your question. There's also a threshold you can change to filter out chunks below a certain similarity score.
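Continuing the earlier sketch, those knobs map to something like this in LlamaIndex (the numbers are just example values, not VOLlama's defaults):

    from llama_index.core import Settings
    from llama_index.core.postprocessor import SimilarityPostprocessor

    # Chunk length is applied when the index is built, so set it
    # before indexing (example value).
    Settings.chunk_size = 512

    # index is the VectorStoreIndex from the earlier sketch.
    # More chunks means more context but also more noise; the cutoff
    # drops any retrieved chunk scoring below the similarity threshold.
    engine = index.as_query_engine(
        similarity_top_k=5,  # number of chunks to retrieve
        node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
    )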

bryansmart,

@chikim Ah. The separate functions weren’t clear to me. Can I erase the index and start over? There isn’t a command in the menu to do that.
