Meta Superintelligence Labs Introduces REFRAG: Scaling RAG with 16× Longer Contexts and 31× Faster Decoding
Table of contents

- Why is long context such a bottleneck for LLMs?
- How does REFRAG compress and shorten context?
- How is acceleration achieved?
- How does REFRAG preserve accuracy?
- What do the experiments reveal?
- Summary
- FAQs

A team of researchers from Meta Superintelligence Labs, the National University of Singapore, and Rice University has unveiled REFRAG (REpresentation For…
