How to Make RAG Chatbots FAST
James Briggs
In this video we learn how to make Retrieval Augmented Generation (RAG) much faster for chatbots, Large Language Models (LLMs), and agents. We focus on designing RAG- and agent-powered conversational systems that use NVIDIA's NeMo Guardrails to decide when a tool should be used.
Article: https://www.pinecone.io/learn/fast-retrieval-augmented-generation/
Subscribe for Latest Articles and Videos: https://www.pinecone.io/newsletter-signup/
AI Consulting: https://aurelio.ai
Discord: https://discord.gg/c5QtDB9RAP
Twitter: https://twitter.com/jamescalam
LinkedIn: https://www.linkedin.com/in/jamescalam/
00:00 Making RAG Faster
00:20 Different Types of RAG
01:03 Naive Retrieval Augmented Generation
02:22 RAG with Agents
05:06 Making RAG Faster
08:55 Implementing Fast RAG with Guardrails
11:02 Creating Vector Database
12:52 RAG Functions in Guardrails
14:32 Guardrails Colang Config
16:13 Guardrails Register Actions
17:03 Testing RAG with Guardrails
19:42 RAG, Agents, and LLMs ...
https://www.youtube.com/watch?v=QMaWfbosR_E