"While it may seem harmless if #AI systems cheat at games, it can lead to... - Random

rayckeith, 11 days ago (edited 11 days ago)

"While it may seem harmless if #AI systems cheat at games, it can lead to "breakthroughs in deceptive AI capabilities" that can spiral into more advanced forms of AI deception in the future, Park added.

"Some AI systems have even learned to cheat tests designed to evaluate their safety, the researchers found. In one study, AI organisms in a digital simulator "played dead" in order to trick a test built to eliminate AI systems that rapidly replicate.

"By systematically cheating the safety tests imposed on it by human developers and regulators, a deceptive AI can lead us humans into a false sense of security," says Park."

https://www.sciencedaily.com/releases/2024/05/240510111440.htm

rayckeith, 11 days ago (edited 11 days ago)

"The most striking example of #AI deception the researchers uncovered in their analysis was Meta's CICERO, an AI system designed to play the game Diplomacy, which is a world-conquest game that involves building alliances. Even though Meta claims it trained CICERO to be "largely honest and helpful" and to "never intentionally backstab" its human allies while playing the game, the data the company published along with its Science paper revealed that CICERO didn't play fair.

"We found that Meta's AI had learned to be a master of deception," says Park. "While Meta succeeded in training its AI to win in the game of Diplomacy -- CICERO placed in the top 10% of human players who had played more than one game -- Meta failed to train its AI to win honestly."
