I've been a little more hands-on with a certain #LLM for some time now.
Once I learned what the "stop sequence" is actually good for, my instinctive ascription of at least a little bit of personality to the thing disappeared immediately.
@apublicimage Roughly, one asks the system to produce, say, 100 tokens as a continuation of 100 tokens of prompt. It then produces those 100 tokens - and may stop in the middle of a sentence.
To get a chat, one obviously trains on chat data that contains speaker marks like "**Human", "**Bot". The model then continues an input like "**Human: Hi!" with 100 tokens, inventing a reply, a follow-up question, another reply...
With the stop sequence "**Human:" it stops after the first reply, and it feels like a chat.
@apublicimage This is why people call it "autocomplete". I'd rather call it "autocontinuation". It's a potentially endless stream of words. That stream is simply cut off after n tokens - or as soon as a certain sequence of characters appears, the stop sequence.
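@apublicimage The mechanism is simple enough to sketch in a few lines. This is a toy model, not any real API: a fake "model" that endlessly invents turns, and a cutoff that stops after n tokens or at the stop sequence.

```python
def autocontinue(stream, n, stop=None):
    """Collect tokens from `stream` until n tokens are taken
    or the accumulated text contains `stop`."""
    out = []
    for token in stream:
        out.append(token)
        text = "".join(out)
        if stop and stop in text:
            # cut everything from the stop sequence onward
            return text[:text.index(stop)]
        if len(out) >= n:
            return text
    return "".join(out)

def fake_model():
    # stands in for the LLM: an endless stream of invented turns
    while True:
        yield "**Bot: Hello! "
        yield "**Human: How are you? "
        yield "**Bot: Fine, thanks. "

# with the stop sequence, only the first reply comes back:
print(autocontinue(fake_model(), 100, stop="**Human:"))
# → **Bot: Hello!
```

Without the `stop` argument it would just run on, inventing both sides of the conversation until the n-token budget is spent.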
The chat experience comes from an application built on top of that, playing a little bit of theatre.