Hugging Face LLMs with SageMaker + RAG with Pinecone
James Briggs
In this video, we'll learn how to build Large Language Model (LLM) + Retrieval-Augmented Generation (RAG) pipelines using open-source models from Hugging Face deployed on AWS SageMaker. We use the MiniLM sentence transformer, with Pinecone as the vector database, to power our semantic search component.
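The core RAG flow covered in the video (embed the query, retrieve similar context vectors, prepend the context to the LLM prompt) can be sketched in a few lines. This is a minimal illustration only: toy random vectors stand in for the MiniLM embeddings, and a NumPy cosine-similarity search stands in for the Pinecone index and SageMaker endpoints used in the video.

```python
import numpy as np

# Toy corpus standing in for the SageMaker FAQs dataset used in the video.
docs = [
    "SageMaker lets you deploy Hugging Face models to managed endpoints.",
    "Pinecone stores dense vectors and returns nearest neighbours for a query.",
    "MiniLM produces compact sentence embeddings for semantic search.",
]

# Stand-in for MiniLM embeddings: random unit vectors (real ones come from
# the deployed embedding model; Pinecone would store and search these).
rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(len(docs), 8))
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)

def retrieve(query_vec, k=2):
    """Top-k docs by cosine similarity -- the role Pinecone plays in the pipeline."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = doc_vecs @ q
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]

def build_prompt(question, query_vec):
    """Prepend retrieved context to the question before sending it to the LLM."""
    context = "\n".join(retrieve(query_vec))
    return (
        "Answer using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt("How do I deploy a Hugging Face model?", rng.normal(size=8))
print(prompt)
```

In the full pipeline, `retrieve` would call the Pinecone index's query method and `build_prompt`'s output would be sent to the LLM endpoint on SageMaker.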
Article: https://www.pinecone.io/learn/sagemaker-rag/
Subscribe for Latest Articles and Videos: https://www.pinecone.io/newsletter-signup/
AI Consulting: https://aurelio.ai
Discord: https://discord.gg/c5QtDB9RAP
Twitter: https://twitter.com/jamescalam
LinkedIn: https://www.linkedin.com/in/jamescalam/
00:00 Open Source LLMs on AWS SageMaker
00:27 Open Source RAG Pipeline
04:25 Deploying Hugging Face LLM on SageMaker
08:33 LLM Responses with Context
10:39 Why Retrieval Augmented Generation
11:50 Deploying our MiniLM Embedding Model
14:34 Creating the Context Embeddings
19:49 Downloading the SageMaker FAQs Dataset
20:23 Creating the Pinecone Vector Index
24:51 Making Queries in Pinecone
25:58 Implementing Retrieval Augmented Generation
30:00 Deleting our Running Instances
#artificialintelligence #nlp #aws #opensource #chatbot ... https://www.youtube.com/watch?v=0xyXYHMrAP0