Hugging Face LLMs with SageMaker + RAG with Pinecone
James Briggs
In this video, we'll learn how to build Large Language Model (LLM) + Retrieval-Augmented Generation (RAG) pipelines using open-source models from Hugging Face deployed on AWS SageMaker. We use the MiniLM sentence transformer, with Pinecone as the vector database, to power our semantic search component.
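The core RAG flow covered in the video (embed the query, retrieve similar context vectors, prepend the context to the LLM prompt) can be sketched in a few lines. This is a minimal illustration only: toy random vectors stand in for the MiniLM embeddings, and a NumPy cosine-similarity search stands in for the Pinecone index and SageMaker endpoints used in the video.

```python
import numpy as np

# Toy corpus standing in for the SageMaker FAQs dataset used in the video.
docs = [
    "SageMaker lets you deploy Hugging Face models to managed endpoints.",
    "Pinecone stores dense vectors and returns nearest neighbours for a query.",
    "MiniLM produces compact sentence embeddings for semantic search.",
]

# Stand-in for MiniLM embeddings: random unit vectors (real ones come from
# the deployed embedding model; Pinecone would store and search these).
rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(len(docs), 8))
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)

def retrieve(query_vec, k=2):
    """Top-k docs by cosine similarity -- the role Pinecone plays in the pipeline."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = doc_vecs @ q
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]

def build_prompt(question, query_vec):
    """Prepend retrieved context to the question before sending it to the LLM."""
    context = "\n".join(retrieve(query_vec))
    return (
        "Answer using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt("How do I deploy a Hugging Face model?", rng.normal(size=8))
print(prompt)
```

In the full pipeline, `retrieve` would call the Pinecone index's query method and `build_prompt`'s output would be sent to the LLM endpoint on SageMaker.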
Article: https://www.pinecone.io/learn/sagemaker-rag/
Subscribe for Latest Articles and Videos: https://www.pinecone.io/newsletter-signup/
AI Consulting: https://aurelio.ai
Discord: https://discord.gg/c5QtDB9RAP
Twitter: https://twitter.com/jamescalam
LinkedIn: https://www.linkedin.com/in/jamescalam/
00:00 Open Source LLMs on AWS SageMaker
00:27 Open Source RAG Pipeline
04:25 Deploying Hugging Face LLM on SageMaker
08:33 LLM Responses with Context
10:39 Why Retrieval Augmented Generation
11:50 Deploying our MiniLM Embedding Model
14:34 Creating the Context Embeddings
19:49 Downloading the SageMaker FAQs Dataset
20:23 Creating the Pinecone Vector Index
24:51 Making Queries in Pinecone
25:58 Implementing Retrieval Augmented Generation
30:00 Deleting our Running Instances
#artificialintelligence #nlp #aws #opensource #chatbot ... https://www.youtube.com/watch?v=0xyXYHMrAP0