Better Llama 2 with Retrieval Augmented Generation (RAG)
James Briggs
Retrieval Augmented Generation (RAG) allows us to keep our Large Language Models (LLMs) up to date with the latest information, reduce hallucinations, and cite the original sources of the information the LLM uses.
We build the RAG pipeline using a Pinecone vector database and a Llama 2 13B chat model, wrapping everything together with Hugging Face and LangChain code.
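As a rough sketch, the full pipeline might be wired together like this with LangChain's RetrievalQA chain; the index name, embedding model, metadata field, and generation settings below are illustrative assumptions rather than the exact values used in the video:

```python
import transformers
import pinecone
from torch import bfloat16
from langchain.llms import HuggingFacePipeline
from langchain.vectorstores import Pinecone
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.chains import RetrievalQA

# Load Llama 2 13B chat from Hugging Face
# (requires accepting Meta's license and authenticating with a HF token)
model_id = "meta-llama/Llama-2-13b-chat-hf"
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=bfloat16,
    device_map="auto",
)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

# Wrap the model in a text-generation pipeline, then in LangChain's LLM interface
generate = transformers.pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,        # assumed generation settings
    repetition_penalty=1.1,
)
llm = HuggingFacePipeline(pipeline=generate)

# Connect to an existing Pinecone index (assumed name "llama-2-rag",
# with the original chunk text stored in a "text" metadata field)
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")
index = pinecone.Index("llama-2-rag")
embed_model = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
vectorstore = Pinecone(index, embed_model.embed_query, "text")

# RetrievalQA: embed the query, retrieve relevant chunks from Pinecone,
# stuff them into the prompt, and let Llama 2 generate a grounded answer
rag_pipeline = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
)
print(rag_pipeline.run("What is so special about Llama 2?"))
```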
Subscribe for Latest Articles and Videos: https://www.pinecone.io/newsletter-signup/
AI Consulting: https://aurelio.ai
Discord: https://discord.gg/c5QtDB9RAP
Twitter: https://twitter.com/jamescalam
LinkedIn: https://www.linkedin.com/in/jamescalam/
00:00 Retrieval Augmented Generation with Llama 2
00:29 Python Prerequisites and Llama 2 Access
01:39 Retrieval Augmented Generation 101
03:53 Creating Embeddings with Open Source
06:23 Building Pinecone Vector DB
08:38 Creating Embedding Dataset
11:45 Initializing Llama 2
14:38 Creating the RAG RetrievalQA Component
15:43 Comparing Llama 2 vs RAG Llama 2
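For the embedding and indexing steps in the chapters above, a minimal sketch could look like the following, assuming an open-source sentence-transformers embedding model and placeholder index name, API credentials, and documents:

```python
import pinecone
from langchain.embeddings import HuggingFaceEmbeddings

# Open-source embedding model (all-MiniLM-L6-v2 outputs 384-dim vectors)
embed_model = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# Create a Pinecone index whose dimensionality matches the embedding model
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")
if "llama-2-rag" not in pinecone.list_indexes():
    pinecone.create_index("llama-2-rag", dimension=384, metric="cosine")
index = pinecone.Index("llama-2-rag")

# Embed document chunks and upsert them with the source text as metadata,
# so the retriever can return the original passages at query time
docs = [
    "Llama 2 is a family of LLMs released by Meta AI in July 2023.",
    "Pinecone is a managed vector database for similarity search.",
]
embeds = embed_model.embed_documents(docs)
vectors = [(str(i), embeds[i], {"text": docs[i]}) for i in range(len(docs))]
index.upsert(vectors=vectors)
```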
#artificialintelligence #nlp #opensource #llama2 ... https://www.youtube.com/watch?v=ypzmPwLH_Q4