ErikJonker, to ai
@ErikJonker@mastodon.social

GPT-4 does a good job analysing financial statements using Chain-of-Thought prompting. If this holds, I see a lot of opportunities for other standardised reports in various domains.
https://bfi.uchicago.edu/working-paper/financial-statement-analysis-with-large-language-models/
#AI #LLM #GPT4 #finance #financialstatements #ChainofThought #CoT

ErikJonker, to ai
@ErikJonker@mastodon.social

Played around with GPT-4o analysing pictures, then analysed the same picture with Google Gemini (the free version, I have to admit). The differences are enormous: the amount of hallucination in Google Gemini is insane, making things up about the picture provided... How can Google be so far behind?

codewiz, to rust
@codewiz@mstdn.io

Got #DeepSeek Coder 33B running on my desktop's #AMDGPU card with #ollama.

First off, I tested its ability to generate and understand #Rust code. Unfortunately, it falls into the same confusion as the smaller 6.7B model.

https://gist.github.com/codewiz/c6bd627ec38c9bc0f615f4a32da0490e
#ollama #llm #deepseek

codewiz,
@codewiz@mstdn.io

To be completely fair, thread safety and atomics are advanced topics.

Several humans I have interviewed for engineering positions would also have a lot of trouble answering these questions. I couldn't write this code on a whiteboard without looking at the Rust library docs.

The main problem here is that the model is making up poor excuses to justify Arc<AtomicUsize>, showing poor reasoning skills.

Larger models should do better with my questions (I haven't tried yet).
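
On the Arc<AtomicUsize> point above, here is a minimal sketch of when the Arc wrapper is actually needed and when it is not. This is not the code from the gist; the function names count_with_arc and count_with_scope are made up for illustration. The assumption is a simple shared counter incremented from several threads: threads started with thread::spawn require 'static data, so the counter has to be shared through Arc, while scoped threads (thread::scope, Rust 1.63+) can borrow a plain AtomicUsize.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: threads created with `thread::spawn` may outlive the
/// caller's stack frame, so the counter must be owned jointly via `Arc`.
fn count_with_arc(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread gets its own handle to the same atomic.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counter.load(Ordering::Relaxed)
}

/// Hypothetical helper: scoped threads are joined before `thread::scope`
/// returns, so they can borrow the atomic directly and no `Arc` is needed.
fn count_with_scope(threads: usize, per_thread: usize) -> usize {
    let counter = AtomicUsize::new(0);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    counter.load(Ordering::Relaxed)
}
```

Both helpers should return threads * per_thread. The atomic is what makes the concurrent increments safe; Arc only solves the ownership and lifetime problem, which is why a justification for Arc<AtomicUsize> has to be about lifetimes rather than thread safety.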

smeg, to ai
@smeg@assortedflotsam.com

GPT-4 didn't really score 90th percentile on the bar exam, MIT study finds
https://link.springer.com/article/10.1007/s10506-024-09396-9
