Retrieval Augmented Generation (RAG) is a useful approach for using your own data in an AI-powered chatbot. In this blog post, I’ll walk through three key strategies for getting the most out of RAG and evaluate each strategy to find the best combinations.
For readers who just want the TL;DR conclusion: the largest RAG accuracy improvement came from exploring different chunking strategies.
- 89% improvement by changing the chunking strategy 📦
- 20% improvement by changing the embedding model 🤖
- 6% improvement by changing the LLM 🧪
Let’s dive into each strategy and find the best performers for a real-world RAG application using RAG component evaluations! 🚀📚
I’ll use the public Milvus documentation web pages as the document data and Ragas as the evaluation method. See my earlier blog post on how to use Ragas. The rest of this blog post is organized as follows:
- Text Chunking Strategies
- Embedding Models
- LLM (Generative) Models