@LChoshen@sigmoid.social

LChoshen


🥇 #NLProc researcher

🥈 Opinionatedly Summarizing new #ML & #NLP papers

🥉 Good science #scientivism

Now: @ibmresearch phd:@nlphuj


LChoshen, to llm

Do LLMs learn foundational concepts required to build world models? (less than expected)

We address this question with 🌐🐨EWoK (Elements of World Knowledge)🐨🌐

a flexible cognition-inspired framework to test knowledge across physical and social domains

https://ewok-core.github.io

LChoshen, to random

Pretrain to predict the future
At each step the model predicts the next n tokens
Performance: 😃
Inference speed: up to ✖️3 faster
Training time: same

MetaAI
Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Rozière, David Lopez-Paz, Gabriel Synnaeve

https://arxiv.org/abs/2404.19737
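
A minimal sketch of what "predicts the next n tokens" means for the training targets. The helper name `multi_token_targets` and the toy token ids are hypothetical and this is not the paper's code; it only shows that each position gets n future tokens as labels instead of one.

```python
from typing import List, Sequence, Tuple


def multi_token_targets(
    tokens: Sequence[int], n: int
) -> List[Tuple[int, Tuple[int, ...]]]:
    """Pair every position t with the n tokens that follow it.

    Positions too close to the end (fewer than n future tokens) are skipped.
    With n == 1 this reduces to ordinary next-token prediction.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    examples = []
    for t in range(len(tokens) - n):
        # Targets for position t are tokens t+1 ... t+n.
        examples.append((t, tuple(tokens[t + 1 : t + 1 + n])))
    return examples


tokens = [10, 11, 12, 13, 14]  # toy token ids (assumption)
for position, targets in multi_token_targets(tokens, n=2):
    print(position, targets)
```

In training, each of the n targets would get its own loss term, and the terms are combined into one objective; at inference the extra predictions can be dropped or used to speed up decoding.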

LChoshen, to generativeAI

DoRA decomposes the weights into magnitude and direction and
surpasses LoRA quite significantly

This is done with an empirical finding that I can't wrap my head around

@nvidia
https://arxiv.org/abs/2402.09353
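
A rough sketch of the magnitude/direction split, assuming the usual formulation: the adapted weight is a per-column magnitude times the column-normalized sum of the frozen weight and a low-rank update B·A. The function names and the tiny matrices are hypothetical, and plain Python lists stand in for tensors.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (d x r) and b (r x k)."""
    return [
        [sum(a[i][p] * b[p][j] for p in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def column_norms(w: Matrix) -> List[float]:
    """L2 norm of every column of w."""
    return [math.sqrt(sum(row[j] ** 2 for row in w)) for j in range(len(w[0]))]


def dora_weight(w0: Matrix, b: Matrix, a: Matrix, magnitude: List[float]) -> Matrix:
    """Rescale each column of (w0 + b @ a) to the given magnitude.

    Assumes no column of (w0 + b @ a) is all zeros.
    """
    delta = matmul(b, a)
    rows, cols = len(w0), len(w0[0])
    v = [[w0[i][j] + delta[i][j] for j in range(cols)] for i in range(rows)]
    norms = column_norms(v)
    return [[magnitude[j] * v[i][j] / norms[j] for j in range(cols)] for i in range(rows)]


w0 = [[3.0, 0.0], [4.0, 2.0]]   # frozen pretrained weight (toy values)
a = [[0.5, -0.5]]               # low-rank factor A, rank 1
b_zero = [[0.0], [0.0]]         # B starts at zero, as in LoRA
m = column_norms(w0)            # magnitude initialized from w0

print(dora_weight(w0, b_zero, a, m))  # equals w0 before any training
```

With B at zero the adapted weight equals the pretrained one, so training starts from the original model; afterwards the magnitude vector and the low-rank direction update are learned separately.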

LChoshen, to ArtificialIntelligence

Happy to share our paper:

Genie🧞: Achieving Human Parity
in Content-Grounded Datasets Generation

was accepted to

From your content
Genie creates content-grounded data
of magical quality ✨
Rivaling human-based datasets!

https://arxiv.org/abs/2401.14367

LChoshen, to random

English code models are better than Chinese models
on Chinese
They hallucinate less
They generalize better

If true, this defies our thoughts on LMs as domain experts
https://arxiv.org/abs/2401.10286

LChoshen, to opensource

Crowd-sourcing human feedback for open-source LLMs? 💬🤖

Let's make it happen together! 💪

https://chromewebstore.google.com/detail/sharelm-share-your-chat-c/nldoebkdaiidhceaphmipeclmlcbljmh

With Shachar Don-Yehiya and Omri Abend

LChoshen,

ShareLM is a chrome plugin that makes it easy for you to contribute your own human-model interactions.

The goal -> collecting an ever-growing dataset of conversations, for the benefit of the open-source community 💬🥳

And this is so easy, so no excuses!

https://chromewebstore.google.com/detail/sharelm-share-your-chat-c/nldoebkdaiidhceaphmipeclmlcbljmh

LChoshen, to modeltrains

keynote (with my live jetlagged interpretation) from @StableDiffusion creator:
scaling is not the solution
A keynote to restart the debate

LChoshen,

@StableDiffusion First, my take: it is not the solution, but remember how many people said it is not a solution, and who (@OpenAI / @ilyasut) said you need more engineering and more scale.
There is a lot to gain even from less appealing ideas
Also from appealing ideas
Extremism is often simplistic

LChoshen,

@StableDiffusion The main problem with scaling: fewer players are able to compete!
Me: I keep telling you that! But we can have other technologies that scale, and also use the community's expertise in a scaled, distributed way, with each contributor evolving the model slightly.

LChoshen,

@StableDiffusion The second issue with scaling is data (@blancheminerva can probably link to her great counter argument, couldn't find it).
We would require more data.
Me: and compute, and other problems. Scaling is hard too. Don't confuse "easy" with "no new algorithm" (scaling).

LChoshen,

@StableDiffusion Also, more data may come with copyright problems.
(@shayneredford what are your thoughts on that? The two must be connected? not sure)

LChoshen,

@StableDiffusion @shayneredford Now we got to the more interesting part for me.
There are hard things to find in the data: rare thoughts and ideas that are hard to capture. Those will still be rare if scaled.
Me: interesting. Still, we will get more of those with time, unless training ignores rarities.

LChoshen, to ArtificialIntelligence

The language people use when they interact with each other changes over the course of the conversation.

🔍 Will we see a systematic language change along the interaction of human users with a text-to-image model?


http://arxiv.org/abs/2311.12131

Shachar Don-Yehiya, me & Omri Abend

LChoshen, to random

What are the strongest/canonical papers that discuss data quality?

LChoshen, to random

Larger models are better😱
But...
Can we train smaller models to be better?
Can we learn about language learning?

Our baby👶, the babyLM challenge, in the @nytimes:
https://www.nytimes.com/2023/05/30/science/ai-chatbots-language-learning-models.html
⭐️🌟
@a_stadt @amuuueller @weGotlieb @jhuclsp @EvaPortelance & @sama

LChoshen, to random

Opposite scaling law: detection of machine-generated text is done better by smaller models

Everyone (outside ...) is afraid GPT would cheat for them, which pushes for detection methods

https://arxiv.org/abs/2305.09859

LChoshen,

First, the problem: given a text, you want to know whether a human wrote it. If you've been in NLP lately, I am sure a teacher, sister, nephew, etc. called and told you they suspect someone handed them a GPT text.
Problem: how can you tell?
The approach:
Randomly replace words
Then see how much that changed the sentence probability/likelihood

presented by
https://arxiv.org/abs/2301.11305
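
A toy sketch of the replace-and-compare idea, under loud assumptions: a hand-written unigram table (`TOY_LOGPROB`) stands in for a real language model, and replacements are drawn uniformly from that table. All names here are hypothetical; the point is only the score, the original log-likelihood minus the average log-likelihood after random replacements.

```python
import random
from typing import List

# Toy unigram log-probabilities standing in for a real LM (assumption).
TOY_LOGPROB = {"the": -1.0, "cat": -3.0, "sat": -3.5, "mat": -4.0, "zebra": -8.0}
UNKNOWN_LOGPROB = -10.0  # score for words missing from the toy table


def log_likelihood(words: List[str]) -> float:
    """Sum of per-word log-probabilities under the toy model."""
    return sum(TOY_LOGPROB.get(w, UNKNOWN_LOGPROB) for w in words)


def perturb(words: List[str], n_replace: int, rng: random.Random) -> List[str]:
    """Replace up to n_replace randomly chosen positions with random vocabulary words."""
    out = list(words)
    vocab = sorted(TOY_LOGPROB)
    for i in rng.sample(range(len(out)), k=min(n_replace, len(out))):
        out[i] = rng.choice(vocab)
    return out


def perturbation_gap(
    text: str, n_perturbations: int = 20, n_replace: int = 2, seed: int = 0
) -> float:
    """Original log-likelihood minus the mean log-likelihood of perturbed copies."""
    words = text.split()
    rng = random.Random(seed)
    original = log_likelihood(words)
    perturbed = [
        log_likelihood(perturb(words, n_replace, rng)) for _ in range(n_perturbations)
    ]
    return original - sum(perturbed) / len(perturbed)


print(perturbation_gap("the cat sat on the mat"))
```

A large positive gap means the text sits near a likelihood peak of the scoring model, which is the signal read as "probably machine-written"; a threshold on the gap would have to be tuned on real data.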

nsaphra, to Stoicism

I accidentally posted something under my 2022 thread but it's time to live in the future! So this is officially the beginning of my 2023 book thread!

LChoshen,

@nsaphra hmm, interesting in the life hacks and relationships hacking context
