#Anthropic - Posts - kbin.social

kellogh, 14 hours ago to LLMs

i’m very excited about the interpretability work that #anthropic has been doing with #LLMs.

in this paper, they used classical machine learning algorithms to discover concepts. if a concept like “golden gate bridge” is present in the text, then they discover the associated pattern of neuron activations.

this means that you can monitor LLM responses for concepts and behaviors, like “illicit behavior” or “fart jokes”

https://www.anthropic.com/research/mapping-mind-language-model
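
A rough sketch of the monitoring idea above, assuming a concept has already been found as a direction in activation space. The vectors, threshold, and function names here are hypothetical toy values for illustration, not Anthropic's code or API; a feature's activation is simplified to a rectified dot product.

```python
from typing import Sequence


def feature_activation(
    activations: Sequence[float],
    direction: Sequence[float],
    bias: float = 0.0,
) -> float:
    """Project an activation vector onto a concept direction, then apply ReLU."""
    if len(activations) != len(direction):
        raise ValueError("activations and direction must have the same length")
    score = sum(a * d for a, d in zip(activations, direction)) + bias
    return max(0.0, score)


def concept_present(
    activations: Sequence[float],
    direction: Sequence[float],
    threshold: float = 1.0,
    bias: float = 0.0,
) -> bool:
    """Flag the concept when its feature activation reaches the threshold."""
    return feature_activation(activations, direction, bias) >= threshold


# Toy 3-dimensional example (hypothetical numbers, not real model activations).
bridge_direction = [0.5, -0.25, 1.0]
on_topic = [2.0, 0.0, 1.0]    # projects strongly onto the direction
off_topic = [-1.0, 2.0, 0.0]  # negative projection, clipped to zero by ReLU

print(concept_present(on_topic, bridge_direction))
print(concept_present(off_topic, bridge_direction))
```

In a real setup the direction would come from whatever dictionary of features was learned for the model, and the threshold would need tuning per concept.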

kellogh, 14 hours ago

this is great work. i’m excited to see where this goes next

i hope #anthropic exposes this via their API. at this point in time, most of the promising interpretability work is only available on open source models that you can run yourself. it would be great to also have them available from #AI vendors

Lobrien, 10 hours ago

@kellogh This does, of course, imply vastly easier subversion of guardrails. Bad actors will have an easier time manipulating bias.

br00t4c, 2 days ago to llm

Here's what's really going on inside an LLM's neural network

#anthropic #llm

https://arstechnica.com/?p=2026236

ianRobinson, 2 days ago to llm

Research paper from Anthropic.

“Today we report a significant advance in understanding the inner workings of AI models. We have identified how millions of concepts are represented inside Claude Sonnet, one of our deployed large language models. This is the first ever detailed look inside a modern, production-grade large language model. This interpretability discovery could, in future, help us make AI models safer.”

#LLM #Anthropic https://www.anthropic.com/research/mapping-mind-language-model

br00t4c, 3 days ago to random

New Anthropic Research Sheds Light on AI's 'Black Box'

#anthropic #thepeople

https://gizmodo.com/new-anthropic-research-sheds-light-on-ais-black-box-1851491333

upright, 9 days ago to random

Why would #anthropic require a phone number to use its app? NOPE.

br00t4c, 14 days ago to random

Anthropic's founders took a shot at OpenAI executives

#anthropic

https://qz.com/anthropic-founders-openai-executives-ai-1851469940

br00t4c, 15 days ago to random

Anthropic Founders Publicly Needle OpenAI Execs

#anthropic

https://gizmodo.com/anthropic-founders-publicly-needle-openai-execs-1851467231

br00t4c, 15 days ago to OpenAI

Anthropic co-founders say their AI models are taking lessons from the harms of social media

#anthropic #openai

https://qz.com/anthropic-safe-ai-bloomberg-technology-summit-amodei-1851466207

br00t4c, 15 days ago to random

6 Practical Tips for Using Anthropic's Claude Chatbot

#anthropic

https://www.wired.com/story/six-practical-tips-for-using-anthropic-claude-chatbot/

robert, 16 days ago to emacs

org-ai got an update today. It now supports the #anthropic #claude and the #perplexity.ai APIs.

https://github.com/rksm/org-ai

#emacs #orgmode #llms

ErikJonker, 22 days ago to ai

Interesting: "Maestro - A Framework for Claude Opus, GPT and local LLMs to Orchestrate Subagents".
I think organising tasks and orchestrating various agents will be important.
https://github.com/Doriandarko/maestro
#maestro #anthropic #AI #LLM #orchestration

br00t4c, 23 days ago to random

Anthropic Brings Claude AI To the iPhone and iPad

#anthropic

https://apple.slashdot.org/story/24/05/01/2147233/anthropic-brings-claude-ai-to-the-iphone-and-ipad

br00t4c, 23 days ago to random

Anthropic releases Claude AI chatbot iOS app

#anthropic

https://arstechnica.com/?p=2021092

br00t4c, 23 days ago to random

Anthropic finally releases a Claude mobile app

#anthropic #claude

https://www.theverge.com/2024/5/1/24145983/anthropic-claude3-model-mobile-app-team-plan

br00t4c, 23 days ago to random

Anthropic Wants to Put Its Claude AI Wherever You Are With New App

#anthropic

https://gizmodo.com/anthropic-claude-ai-ios-apple-app-1851448285
