Mark Pilgrim famously remarked that the difference between a corporate & personal blog is that you can say “motherfucker” on your personal blog.
I wonder if the human-vs-LLM language battleground will be similar. I wonder if there’ll be an incentive to write in an idiosyncratic highly-individual hard-edged never-ever-bland style just so as to sound like a human.
You don’t think there’ll be a battleground? Silly you.
If you're worried about #LLM-based #AI, you're focusing on the wrong thing and may lack imagination.
This, and related developments (I can't view them as advancements, knowing how this all will end), are what's going to end the human race as we've known ourselves.
Much good can be derived from technologies like this, but we—being as we are—will ultimately go much too far.
I don't think we're prepared for our instant evolution (and, separately, eventual mechanization).
I get a lot of pushback when I admonish people to accurately describe what an #LLM is doing - I'm told 'that ship has sailed' or 'just deal with the fact that people say they think'.
It matters. It fucking matters. It matters because using the wrong words for it indicates that people think those "answers" are something that they're not - that they can never, ever be.
#Mastodon #Fediverse server admins: Have you considered adding a Terms of Use clause that prohibits the use of posts for training #GenerativeAI #AI #LLM without explicit user consent? I feel like abuse of user-generated content (text and other media included) for training is already upon us, and I wonder if we shouldn’t set ourselves up for legal recourse at some point in the future if we ever need to.
Social media is one of the most ready pools of materials for training models. This tends to continue the trend of generating profits for private corporations by harvesting “public” goods without compensation, especially from artists who work hard to create quality media. I hope the Fediverse can exclude itself from this phenomenon somehow.
I haven’t found ChatGPT’s composition to be all that compelling or well-done. I mean, it’s impressive that the technology can do it at all, but does anyone have an example of anything like a 3k word story that is well-written and fun to read with a unique voice? Not saying this could never happen (with future models), but every time I’ve prompted a story it’s pretty weak and obviously AI-generated. What am I missing? #ChatGPT #WritingCommunity #fiction #AI #MachineLearning #LLM
With this morning's IntelliJ update I started seeing these AI prompts. While it is exciting to see this coming to desktop software, not just running in the browser, I'm still not touching these things until they move to locally running models. Even if I trusted all these companies with all this data, I'm sick of feeding an even higher percentage of our digital lives into the data lakes of the same companies or their proxies (yes, I'm referring to you, OpenAI). #JetBrains #AI #LLM #OpenAI Introducing JetBrains AI and the In-IDE AI Assistant | The JetBrains Blog
Very interesting discussion: should large language models / AI be open source or not? Meta and Google disagree. Governments and regulators also have a role in this debate. #AI #LLM #Opensource
“Prompt engineering” is such a bizarre line of work. You’re trying to convince a machine trained on a huge pile of (hopefully) human-generated text to produce some useful output by guessing what sequence of human-like words you must put in to make it likely that the model will produce coherent, human-like output that is good enough to pass downstream.
You really have no idea how your prompt caused the model to produce its output (yes, you understand its process, but not the actual factors that contribute to its decisions). If the output happens to be good, you still have no idea how far you can push your input before the model returns bad output.
Prompt engineers talk to the model like a human, because that’s the only mental model they have for predicting how it will respond to their inputs. It’s a very poor metaphor for programming, but there is nothing better to reach for.
As AI tools for writing become more common, let me throw one more worry into the mix: Students who write well without AI assistance may be falsely accused of #plagiarism by teachers using imperfect tools to detect AI-assisted writing.
Thinking some more about using external plugins to aid & guide LLM output over longer generation, and had a crazy idea that might just be workable in Oobabooga:
A plugin that adjusts generation parameters based on a pre-defined list & inline commands.
It would work something like 3rd-party tool calling with parameters, BUT on seeing [Tempo:+] in the output, generation is paused, the generation params are adjusted, and then generation resumes.
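A minimal sketch of the parsing half of that idea: scan each chunk of streamed output for inline commands, strip them from the visible text, and nudge a generation parameter. The [Tempo:±] syntax, the 0.1 temperature step, and the clamping range are all assumptions for illustration, not any real Oobabooga API.

```python
import re

# Matches inline commands like [Tempo:+] or [Tempo:-] in generated text.
COMMAND = re.compile(r"\[Tempo:([+-])\]")

def apply_inline_commands(chunk: str, params: dict) -> str:
    """Adjust params in place for each command found, then return the
    chunk with the command markers removed so readers never see them."""
    for sign in COMMAND.findall(chunk):
        step = 0.1 if sign == "+" else -0.1
        # Clamp to a plausible temperature range (values are assumptions).
        params["temperature"] = round(
            min(2.0, max(0.1, params["temperature"] + step)), 2
        )
    return COMMAND.sub("", chunk)

params = {"temperature": 0.7}
visible = apply_inline_commands("The pace quickens.[Tempo:+] She runs.", params)
print(visible)                # The pace quickens. She runs.
print(params["temperature"])  # 0.8
```

In a real extension the pause/resume would hook into the generation loop; this only shows how the inline commands could be detected and consumed.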
i wish i knew more about comparing #embeddings. anyone have resources? one thing i’ve wondered is how to convert an embedding from a “point” to an “area” or “volume”. e.g. an embedding of a 5 paragraph essay will occupy a single point in embedding space, but if you broke it down (e.g. by paragraph), there would be several points and the whole would presumably be at the center. is there a way to trace the full space a text occupies in #embedding space? #LLMs #LLM #AI #NLP
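One crude way to make the point-vs-region idea concrete: embed each paragraph separately, take the centroid as the "whole", and the maximum distance from the centroid as a radius describing the region the text occupies. The `embed()` function below is a toy deterministic stand-in for a real embedding model (an assumption, not a real API), just so the sketch runs on its own.

```python
import hashlib
import numpy as np

def embed(text: str, dim: int = 8) -> np.ndarray:
    """Toy stand-in for an embedding model: deterministic unit vectors
    derived from a hash of the text (hypothetical, for illustration)."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    g = np.random.default_rng(seed)
    v = g.normal(size=dim)
    return v / np.linalg.norm(v)  # unit length, like many sentence embeddings

paragraphs = ["intro ...", "first argument ...", "example ...",
              "counterpoint ...", "conclusion ..."]
points = np.stack([embed(p) for p in paragraphs])

centroid = points.mean(axis=0)                            # the whole text's center
radius = np.linalg.norm(points - centroid, axis=1).max()  # extent of the region
```

A centroid plus radius is only a ball; fitting something like a covariance ellipsoid over the paragraph points would trace the region's shape more faithfully.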
Whenever I see OpenAI's Sam Altman with his pseudo-innocent glance, he always reminds me of Carter Burke from Aliens (1986), who deceived the entire spaceship crew in favor of his corporation, with the aim of getting rich by weaponizing a newly discovered intelligent lifeform.
AI-Generated Data Can Poison Future AI Models - Scientific American (archive.is)
As AI-generated content fills the Internet, it’s corrupting the training data for models to come. What happens when AI eats itself?