Google Gemini aims for a 10 million token context window. It's so large that you can put in books, docs, and videos, and they all fit. Will this replace RAG?
I don't think so, because:
-💸 cost; you still pay per token
-🐢 slow response times
-🐞 a huge context is hard to debug
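The cost point is easy to check with back-of-the-envelope math. A minimal sketch, using a made-up per-token price (not real Gemini pricing) to compare stuffing everything into context vs. retrieving a few relevant chunks:

```python
# Illustrative price only -- an assumption, not any provider's real rate.
PRICE_PER_MILLION_INPUT_TOKENS = 1.00  # USD, hypothetical

def request_cost(prompt_tokens: int) -> float:
    """Cost of a single request, counting input tokens only."""
    return prompt_tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS

long_context = request_cost(10_000_000)  # send the whole 10M-token corpus
rag = request_cost(4_000)                # send only a few retrieved chunks

print(f"long context: ${long_context:.2f} per request")  # $10.00
print(f"RAG:          ${rag:.4f} per request")           # $0.0040
print(f"ratio: {long_context / rag:.0f}x")               # 2500x
```

Whatever the real price is, the ratio is what matters: you pay it on every single request.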
@daniel_js_craft long context can’t ever replace RAG because regular boring computer I/O that’s already been optimized to kingdom come isn’t fast enough to send megabytes or gigabytes to an API on a per-request basis, no matter how fast LLMs get
but it does open up some very interesting use cases, so it’s absolutely worth paying attention to
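The I/O argument above is also easy to estimate. A rough sketch, assuming ~4 bytes per token and a 100 Mbps uplink (both numbers are assumptions, not measurements):

```python
BANDWIDTH_MBPS = 100  # assumed client uplink, megabits per second

def upload_seconds(payload_megabytes: float) -> float:
    """Seconds just to move the payload to the API, before inference starts."""
    return payload_megabytes * 8 / BANDWIDTH_MBPS

# ~10M tokens at ~4 bytes/token is roughly 40 MB of text per request
print(f"{upload_seconds(40):.1f} s")  # 3.2 s of pure upload, every request
```

That latency floor exists no matter how fast the model itself gets.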