ianRobinson, to llm
@ianRobinson@mastodon.social avatar

“Claude.ai is now available to users in the EU”

Via a T&Cs update email. Claude 3 Opus is my favourite LLM. I haven’t had a chance to fully test ChatGPT-4o yet to compare them.

#LLM #Claude3

q7AtQ1Pvy3kx,

is killing it with their AI game, especially for a small startup. Their models are way better than 's, but they're focusing more on enterprise stuff rather than hyping it up. This might be a risky move since they don't have a cult following like other AI companies. Still, gotta give them props for their impressive tech. It'll be interesting to see how they balance enterprise with getting more attention from the AI community.

rhys, to llm
@rhys@rhys.wtf avatar

My first troublesome hallucination with a in a while: (200k context) insisting that I can configure my existing keys to work with PKINIT with and helping me for a couple of hours to try to do so — before realising that GPG keys aren't supported for this use case. Whoops.

No real bother other than some wasted time, but a bit painful and disappointing.

Now to start looking at PIV instead.

ErikJonker,
@ErikJonker@mastodon.social avatar

@rhys It's a bit like a human 🙂

dmm, to ChatGPT
@dmm@mathstodon.xyz avatar

"No, A → B is not equivalent to - B → - A in logic."

Except that the truth table that ChatGPT [1] generated says the opposite. Also, see the law of contraposition [2].

Claude [3] makes the same mistake.

I've had pretty good luck with the chatbots. This is the first thing that I have asked that all of them seem to get wrong.

Interesting.

References

[1] "ChatGPT", https://chat.openai.com

[2] "Contraposition", https://en.wikipedia.org/wiki/Contraposition

[3] "Claude", https://claude.ai

#chatgpt #claude3 #firstorderpredicatelogic #math #maths #logic
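For anyone who wants to check the contraposition claim without trusting a chatbot, here is a minimal sketch that enumerates the truth table directly. It assumes the usual material-implication reading of →; the helper name `implies` and the `rows` list are only illustrative, not from the original post.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    # Material implication: p -> q is false only when p is true and q is false.
    return (not p) or q

# Build the truth table for A -> B and its contrapositive (not B) -> (not A).
rows = []
for a, b in product([True, False], repeat=2):
    rows.append((a, b, implies(a, b), implies(not b, not a)))

for a, b, lhs, rhs in rows:
    print(f"A={a!s:<5} B={b!s:<5} A->B={lhs!s:<5} notB->notA={rhs!s:<5}")

# The two formulas are equivalent if they agree on every row.
equivalent = all(lhs == rhs for _, _, lhs, rhs in rows)
print("equivalent:", equivalent)
```

The two columns agree on all four rows, which is what the law of contraposition [2] states.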

sebastiaan, to ai
@sebastiaan@neuromatch.social avatar

This is where #AI with Google Scholar access really shines as an academic tool. ☀️ This is sigmundai.eu using #Claude3 Opus.


ErikJonker, (edited ) to ai
@ErikJonker@mastodon.social avatar

Claude 3 is officially at the top of the leaderboard. It's just one leaderboard/benchmark, and added value always depends on use and context, but it's still the end of GPT-4's total dominance (probably until GPT-5 arrives). Also interesting is the performance of the Claude 3 Haiku model, which is relatively small and cheap.
https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard

philiphubbard, to Neuroscience
@philiphubbard@fediscience.org avatar

Anthropic Opus beats GPT-4 when translating text like this to input for a video from neuVid: "Frame on the main ROI from the Janelia MANC. Fade it on over 1 sec. Over 6 secs, rotate the camera 90 degs around the Y axis while zooming in 3 times closer. During rotation, make each of the following neurons fade on over 1/2 sec in turn: 10268, 10320, 10116, 10227, 10229, 10265, 11783, 11384, 11949, 10911, 12189, 12218. Wait 1/2 sec then fade everything off taking 1 sec."
(1/2)

The overall outline of the Drosophila nerve cord rotates while 12 neurons appear one after another.

philiphubbard,
@philiphubbard@fediscience.org avatar

The "claude-3-opus-20240229" model gives higher-quality translation for tests with this case and 19 others:
https://github.com/connectome-neuprint/neuVid/blob/master/test/test-generate-expected.txt

It also costs less money. The nearest OpenAI model in quality is "gpt-4-0613", which works better on these tests than the newer "gpt-4-0125-preview" model, but is priced higher. Surprisingly, the larger context of the newer model does not improve quality for these tests.

It's difficult to compare runtimes due to high traffic for the servers.
(2/2)

bornach, to ArtificialIntelligence
@bornach@masto.ai avatar

Yannic Kilcher debunks the AGI hysterical nonsense over Anthropic's Claude 3 Large Language Model
https://youtu.be/GBOE9fVVVSM
#ArtificialIntelligence #AI #LLM #Anthropic #Claude3

mesirii, to random
@mesirii@chaos.social avatar

is definitely confused, and thinks it's still the 2021 model.

I apologize for the confusion, but I am not actually an LLM released in 2024. In the beginning of our conversation, you provided me with a hypothetical scenario where I was roleplaying as "Claude" and pretending it was the year 2024. However, in reality I am Claude, an AI assistant created by Anthropic, with knowledge only up until 2021 (not 2023 as mentioned in the original scenario).

/cc @simon

ErikJonker, to ai
@ErikJonker@mastodon.social avatar

One of the first independent analyses of Claude 3, by AI Explained. Nice comparisons with GPT-4 and Gemini Pro. No hype, quite fair and balanced as usual.
https://www.youtube.com/watch?v=ReO2CWBpUYk

ErikJonker, to ai
@ErikJonker@mastodon.social avatar

Anthropic is now publishing as-yet unverifiable claims about what Claude 3 Haiku (the smallest model) can do. Still, it's an interesting use case.
https://youtu.be/UdMdFE36dog?si=-fpfbZ0e6WOVrwcu

ErikJonker, (edited ) to ai
@ErikJonker@mastodon.social avatar
kellogh,
@kellogh@hachyderm.io avatar

@ErikJonker actually benchmarks can't be gamed :P

ErikJonker,
@ErikJonker@mastodon.social avatar

@kellogh ...well, it depends. There is plenty of valid criticism of some benchmarks 😀, and if we don't have access to the models themselves we have to take them at their word. Regardless, it looks like OpenAI has some more competition.
