TedUnderwood (@TedUnderwood@sigmoid.social)

How can we evaluate the factual accuracy of long answers from LLMs? Researchers from DeepMind / Stanford demonstrate a strategy that uses LLMs + search to assess factuality: it's more accurate than human evaluation and 20x cheaper. h/t Marc Lanctot on Threads. arxiv.org/abs/2403.18802
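To give a feel for the approach: here is a minimal, untested sketch of an LLM-plus-search factuality check in the spirit of the paper (split a long answer into atomic facts, then verify each fact against search results). It is my own reading of the abstract, not the authors' code; call_llm and web_search are hypothetical placeholders for whatever model and search clients you actually use, and the prompts are illustrative only.

# Sketch of an LLM + search factuality check (assumptions noted in comments).
# `call_llm` and `web_search` are hypothetical stubs, not the paper's API.
from dataclasses import dataclass

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text reply."""
    raise NotImplementedError("wire up your own model client here")

def web_search(query: str, num_results: int = 3) -> list[str]:
    """Placeholder: return text snippets for `query` from a search API."""
    raise NotImplementedError("wire up your own search client here")

@dataclass
class FactVerdict:
    fact: str
    supported: bool

def split_into_facts(answer: str) -> list[str]:
    """Ask the LLM to decompose a long answer into self-contained atomic facts."""
    reply = call_llm(
        "List every individual factual claim in the text below, one per line, "
        "rewritten so each claim stands alone without pronouns.\n\n" + answer
    )
    return [line.strip("- ").strip() for line in reply.splitlines() if line.strip()]

def check_fact(fact: str) -> FactVerdict:
    """Use search snippets plus an LLM judgment to label one fact."""
    query = call_llm("Write a short web search query to verify this claim: " + fact)
    snippets = web_search(query)
    verdict = call_llm(
        "Claim: " + fact + "\n\nEvidence:\n" + "\n".join(snippets) +
        "\n\nAnswer SUPPORTED or NOT_SUPPORTED."
    )
    return FactVerdict(fact=fact, supported=verdict.strip().upper().startswith("SUPPORTED"))

def factuality_report(answer: str) -> dict:
    """Score a long-form answer as the fraction of its atomic facts that are supported."""
    verdicts = [check_fact(f) for f in split_into_facts(answer)]
    supported = sum(v.supported for v in verdicts)
    return {"facts": len(verdicts), "supported": supported,
            "precision": supported / len(verdicts) if verdicts else 0.0}

If I read the paper correctly, its aggregate metric also rewards recall up to a target number of supported facts (F1@K); the sketch above only reports the supported fraction.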
