[Other] Beyond Language Models: Byte Models are Digital World Simulators

Abstract: Traditional deep learning often overlooks bytes, the basic units of the digital world, where all forms of information and operations are encoded and manipulated in binary format. Inspired by the success of next token prediction in natural language processing, we introduce bGPT, a model with next byte prediction to simulate the digital world. bGPT matches specialized models in performance across various modalities, including text, audio, and images, and offers new possibilities for predicting, simulating, and diagnosing algorithm or hardware behaviour. It has almost flawlessly replicated the process of converting symbolic music data, achieving a low error rate of 0.0011 bits per byte in converting ABC notation to MIDI format. In addition, bGPT demonstrates exceptional capabilities in simulating CPU behaviour, with an accuracy exceeding 99.99% in executing various operations. Leveraging next byte prediction, models like bGPT can directly learn from vast binary data, effectively simulating the intricate patterns of the digital world.

Lay summary (by Claude): Most deep learning models process text, audio, images and other data in human-readable formats. But computers operate on streams of binary data - sequences of 1s and 0s. Researchers have developed a new deep learning model called bGPT that tries to simulate and predict sequences of bytes (groups of 8 bits). It works by trying to guess the next byte that will appear in a sequence, similar to how language models try to predict the next word. Remarkably, despite only seeing raw bytes and no human-readable data, bGPT can process text, audio, images and other modalities about as well as specialized deep learning models for those data types. It can even replicate intricate computer operations like converting a symbolic music format (ABC notation) to the MIDI format or executing CPU machine code instructions. Instead of needing different models for different data types, bGPT learns directly from binary data. This demonstrates how models that understand the language of bytes could simulate many aspects of computing and unlock new applications.
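To make the idea of next-byte prediction concrete, here is a minimal toy sketch in Python. It is not the bGPT architecture (which is a transformer); it is just a hypothetical bigram frequency table over raw bytes, illustrating that any binary data can be treated as a sequence of symbols to predict, one byte at a time.

```python
# Toy illustration of next-byte prediction (NOT the actual bGPT model):
# a bigram frequency table over raw bytes.
from collections import Counter, defaultdict


def train_bigram(data: bytes):
    """Count how often each byte value follows each other byte value."""
    table = defaultdict(Counter)
    for prev, nxt in zip(data, data[1:]):
        table[prev][nxt] += 1
    return table


def predict_next(table, prev: int) -> int:
    """Return the most frequent successor of `prev` seen in training."""
    if prev not in table:
        return 0  # fall back to a null byte for unseen contexts
    return table[prev].most_common(1)[0][0]


# Any data works, because everything is bytes to the model.
data = b"hello hello hello"
model = train_bigram(data)
# In this data, 'e' always follows 'h'.
print(predict_next(model, ord("h")))  # prints 101 (the byte value of 'e')
```

A real byte model replaces the frequency table with a learned neural network, but the training objective is the same: given the bytes so far, predict the next one.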
