Saturday, August 17, 2024

RAG with local quantized LLM model

While there are many Retrieval Augmented Generation (RAG) posts out there, most of them rely on external LLM APIs such as OpenAI or Amazon Bedrock.

How well does RAG work with quantized LLMs that run locally? I set one up here, and it works quite well for me.
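The overall flow is the same as with a hosted API: retrieve the most relevant documents for a query, stuff them into the prompt, then hand the prompt to the local model. Here is a minimal, self-contained sketch of that pipeline; the bag-of-words retrieval, the sample documents, and the commented-out `llama-cpp-python` call with its `model.gguf` path are all illustrative assumptions, not the exact setup used in this post.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding". A real setup would use a proper
    # sentence-embedding model; this keeps the sketch dependency-free.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse token-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Stand-in document store; in practice this would be your own corpus.
docs = [
    "Quantized models trade some accuracy for a smaller memory footprint.",
    "RAG retrieves relevant documents and injects them into the prompt.",
    "Local inference avoids sending data to external APIs.",
]

def retrieve(query, k=1):
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query):
    # Inject the retrieved context ahead of the question.
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# The prompt would then go to a locally running quantized model,
# e.g. via llama-cpp-python (hypothetical model path):
#   from llama_cpp import Llama
#   llm = Llama(model_path="model.gguf")
#   print(llm(build_prompt("Why avoid external APIs?")))
```

The only part that touches the LLM is the final generation step, which is why swapping a hosted API for a local quantized model is mostly a one-line change.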
