How to Make RAG Chatbots FAST

James Briggs

description

In this video, we learn how to make Retrieval Augmented Generation (RAG) fast for chatbots, Large Language Models (LLMs), and agents. We focus on designing RAG- and agent-powered chatbots that use NVIDIA's NeMo Guardrails to decide when to call a tool.

📕 Article: https://www.pinecone.io/learn/fast-retrieval-augmented-generation/

📌 Code: https://github.com/pinecone-io/examples/blob/master/learn/generation/chatbots/nemo-guardrails/03-rag-with-actions.ipynb

🌲 Subscribe for Latest Articles and Videos: https://www.pinecone.io/newsletter-signup/

👋🏼 AI Consulting: https://aurelio.ai

👾 Discord: https://discord.gg/c5QtDB9RAP

Twitter: https://twitter.com/jamescalam

LinkedIn: https://www.linkedin.com/in/jamescalam/

00:00 Making RAG Faster
00:20 Different Types of RAG
01:03 Naive Retrieval Augmented Generation
02:22 RAG with Agents
05:06 Making RAG Faster
08:55 Implementing Fast RAG with Guardrails
11:02 Creating Vector Database
12:52 RAG Functions in Guardrails
14:32 Guardrails Colang Config
16:13 Guardrails Register Actions
17:03 Testing RAG with Guardrails
19:42 RAG, Agents, and LLMs
... https://www.youtube.com/watch?v=QMaWfbosR_E

created

2025-02-21

license

Copyrighted (contact publisher)

File size

156883446 Bytes