Lenguador
@Lenguador@kbin.social

Lenguador,

They provide a base model as well as an aligned model, I wish more models were released that way.

Retentive Network: A Successor to Transformer for Large Language Models (arxiv.org)

This is an exciting new paper that replaces attention in the Transformer architecture with a set of decomposable matrix operations ("retention") that retain the modeling capacity of Transformer models while allowing parallel training and efficient RNN-like inference, with no softmax...
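The key trick is that the same retention output can be computed either in parallel (for training) or as a γ-decayed recurrence (for inference). Here is a toy single-head numpy sketch of that equivalence, ignoring the paper's normalization, xPos rotation, gating, and multi-scale decay (all shapes and the γ value are arbitrary choices for illustration):

```python
import numpy as np

def retention_parallel(Q, K, V, gamma):
    """Parallel (training) form: O = (Q K^T * D) V,
    where D[n, m] = gamma**(n - m) for n >= m, else 0."""
    T = Q.shape[0]
    n, m = np.arange(T)[:, None], np.arange(T)[None, :]
    D = np.where(n >= m, float(gamma) ** (n - m), 0.0)  # causal decay mask
    return (Q @ K.T * D) @ V

def retention_recurrent(Q, K, V, gamma):
    """Recurrent (inference) form: S_n = gamma * S_{n-1} + k_n^T v_n,
    then o_n = q_n S_n -- O(1) state per step, no attention matrix."""
    S = np.zeros((K.shape[1], V.shape[1]))
    outputs = []
    for q, k, v in zip(Q, K, V):
        S = gamma * S + np.outer(k, v)
        outputs.append(q @ S)
    return np.stack(outputs)

rng = np.random.default_rng(0)
Q, K, V = rng.standard_normal((3, 6, 4))  # seq_len=6, dim=4 each
par = retention_parallel(Q, K, V, gamma=0.9)
rec = retention_recurrent(Q, K, V, gamma=0.9)
assert np.allclose(par, rec)  # same outputs, two compute schedules
```

Expanding the recurrence gives o_n = Σ_{m≤n} γ^(n-m) (q_n·k_m) v_m, which is exactly what the masked parallel product computes, hence the assertion.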

Lenguador,

This looks amazing, if true. The paper is claiming state of the art across literally every metric. Even in their ablation study the model outperforms all others.

I'm a bit suspicious that they don't extend their perplexity numbers to the 13B model, or provide its hyperparameters, though they do reference it in the text and in their scaling table.

Code will be released in a week https://github.com/microsoft/unilm/tree/master/retnet

Lenguador,

Why do you say they have no representation? There are a lot of specific bodies operating in the government, advisory and otherwise, with the sole focus of indigenous affairs. And currently, indigenous Australians are actually overrepresented among parliamentarians: more than 4% of parliamentarians are of indigenous descent.

Lenguador,

From the study:

The tasks demanded clear, persuasive, relatively generic writing, which are arguably ChatGPT’s central strengths. They did not require context-specific knowledge or precise factual accuracy.

And:

We required short tasks that could be explicitly described for and performed by a range of anonymous workers online

The graphs also show greater improvement for the lowest performers than for the high performers.

Definitely an encouraging result, but in line with anecdotes that, currently, LLMs are only useful for generic, low-complexity tasks, and are most helpful for low performers.

Lenguador,

While in general I'd agree, look at the damage a single false paper on vaccination did. There were a lot of follow-up studies showing that the paper was wrong, and yet we still have an antivax movement going on.

Clearly, scientists need to be able to publish without fear of reprisal. But to have no recourse when damage is done by a person acting in bad faith is also a problem.

Though I'd argue we have the same issue with the media, where they need to be able to operate freely, but are able to cause a lot of harm.

Perhaps there could be some set of rules which absolve scientists of legal liability. Hopefully those rules would be what's ordinarily followed anyway, and so be no burden to the average researcher.

Lenguador,

See this comment on another thread for some more details.

Lego-like Compliant-mechanism Building Blocks that Maintain their DOFs (www.youtube.com)

Configuration-indifferent compliant building blocks that can be assembled like Lego in any configuration or arrangement to produce compliant mechanisms of any complexity that achieve the same degrees of freedom (DOFs) as their constituent building blocks.

Compliant Mechanisms that learn - Mechanical Neural Network Architected Materials (www.youtube.com)

The world’s first mechanical neural network that can learn its behavior. It consists of a lattice of compliant mechanisms that constitute an artificially intelligent (AI) architected material, which gets better and better at acquiring desired behaviors and properties with increased exposure to unanticipated ambient loading...

Lenguador,

Taking 89.3% men from your source at face value, and selecting 12 people at random, that gives about a 26% chance (roughly 1 in 4) that a company of that size would be all male.
Add in network effects, risk tolerance for startups, and the hiring practices of larger companies, and that probability likely gets even higher.
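The arithmetic is just an independence assumption, sketched below (the 0.893 and the headcount of 12 come from the figures above; real hiring is of course not i.i.d.):

```python
p_male = 0.893    # fraction of men in the field, per the linked source
n_hires = 12      # company size, per the story

# Chance that 12 independent draws from that population are all male.
p_all_male = p_male ** n_hires
print(f"{p_all_male:.1%}")  # prints "25.7%", roughly 1 in 4
```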

What's the p-value for a news story? Unless this is some trend from other companies run by Musk, there doesn't seem to be anything newsworthy here.

Artificial Muscles Flex for the First Time: Ferroelectric Polymer Innovation in Robotics (scitechdaily.com)

A new ferroelectric polymer that efficiently converts electrical energy into mechanical strain has been developed by Penn State researchers. This material, showing potential for use in medical devices and robotics, overcomes traditional piezoelectric limitations.

Lenguador,

So, taking the average bicep volume as 1000 cm³, this muscle could exert 1 tonne of force, contract 8% (1.6 cm for a 20 cm long bicep), and would require 400 kV and a temperature above 29 degrees Celsius.
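A quick back-computation of the actuation stress those figures imply (all inputs are the rough numbers above, not values from the paper; the 50 cm² cross-section is just volume divided by length):

```python
# Rough bicep-scale numbers from the estimate above (assumptions, not paper data).
volume_cm3 = 1000.0   # average bicep volume
length_cm = 20.0      # assumed muscle length
force_kg = 1000.0     # "1 tonne" of force
g = 9.81              # gravitational acceleration, m/s^2

area_m2 = (volume_cm3 / length_cm) * 1e-4    # 50 cm^2 -> 0.005 m^2
stress_mpa = force_kg * g / area_m2 / 1e6    # implied stress, ~2 MPa
contraction_cm = 0.08 * length_cm            # 8% strain -> 1.6 cm
print(f"{stress_mpa:.2f} MPa, {contraction_cm:.1f} cm")
```

An implied actuation stress of about 2 MPa is the quantity to compare against whatever the paper actually reports for the material.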

Maybe someone with access to the paper can double check the math and get the conversion efficiency from electrical to mechanical.

I expect there's a good trade-off to be made to lower the force but increase the contraction and lower the voltage. Possibly some kind of ratcheting mechanism with tiny cells could be used to overcome the crazy high voltage requirement.

Lenguador,

GPT-4 was fine-tuned on English and Chinese instruction examples only (source). There's clearly some Western bias in the historical events, but it would have been interesting to also examine whether there was a bias towards Chinese events, and if so, which other languages or prompts might elicit it.
As an example, could you induce an English bias with "I'm from America..." and a Chinese bias with "I'm from China...", even when prompting in English?

Lenguador,

DALL-E was the first development that shocked me. AlphaGo was very impressive on a technical level, and arrived much earlier than anticipated, but it didn't feel different.
GANs existed, but they never seemed to have the creativity or the understanding of prompts that DALL-E demonstrated. Of all things, the image of an avocado-themed chair is still baked into my mind. I remember being gobsmacked by the imagery, and, once I'd recovered from that, by just how "simple" the step from what we had before to DALL-E was.
The other thing that surprised me was the step from image diffusion models to 3D and video. We certainly haven't gotten anywhere near the same quality in those domains yet, but they felt so far from the image domain that I assumed we'd need some major revolution in the way we approached the problem. What surprised me most was just how fast the transition from images to video happened.

Hardwiring ViT Patch Selectivity into CNNs using Patch Mixing (arxiv.org)

Vision transformers (ViTs) have significantly changed the computer vision landscape and have periodically exhibited superior performance in vision tasks compared to convolutional neural networks (CNNs). Although the jury is still out on which model type is superior, each has unique inductive biases that shape their learning and...
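My reading of the title is that "patch mixing" augments a training image by swapping some of its patches for patches from another image, so a CNN is forced to learn ViT-like selectivity about which patches to ignore. A rough numpy sketch of that idea (this is an illustration of the general augmentation, not the paper's exact procedure; patch size and mix fraction are arbitrary):

```python
import numpy as np

def patch_mix(img_a, img_b, patch=4, frac=0.25, rng=None):
    """Replace a random fraction of non-overlapping patches in img_a
    with the corresponding patches from img_b (illustrative sketch)."""
    if rng is None:
        rng = np.random.default_rng()
    out = img_a.copy()
    H, W = img_a.shape[:2]
    rows, cols = H // patch, W // patch
    n_mix = int(rows * cols * frac)
    # Pick distinct patch indices to overwrite.
    for i in rng.choice(rows * cols, size=n_mix, replace=False):
        r, c = divmod(i, cols)
        out[r*patch:(r+1)*patch, c*patch:(c+1)*patch] = \
            img_b[r*patch:(r+1)*patch, c*patch:(c+1)*patch]
    return out

a = np.zeros((32, 32, 3), dtype=np.uint8)          # "clean" image
b = np.full((32, 32, 3), 255, dtype=np.uint8)      # "intruder" image
mixed = patch_mix(a, b, rng=np.random.default_rng(0))
```

With frac=0.25 exactly a quarter of the pixels come from the second image, which makes the effect easy to verify on the toy black/white pair above.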

Lenguador,

I find the link valuable. Despite the proliferation of AI in pop culture, actual discussion of machine learning research is still niche. The community on Reddit is quite valuable and took a long time to form.
