Turn any PDF into an intelligent AI-powered Q&A system using Retrieval-Augmented Generation (RAG).
This project implements a Retrieval-Augmented Generation (RAG) pipeline that lets users ask questions about a PDF document and receive context-aware, accurate answers along with source references.
Instead of relying solely on the LLM's built-in knowledge, the system retrieves relevant chunks from the document and feeds them into the model, ensuring grounded and reliable responses.
- Load PDF directly from a URL
- Intelligent text chunking for better retrieval
- Semantic search using embeddings
- Vector storage using ChromaDB
- Fast inference using the Groq LLM API
- RetrievalQA pipeline built with LangChain
- Returns answers with source context
- Scalable and modular design
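The chunking step above is what makes retrieval work: chunks small enough to embed precisely, with overlap so a sentence cut at a boundary still appears whole in at least one chunk. A minimal dependency-free sketch of that strategy (the real pipeline uses LangChain's text splitters; `chunk_text` and its defaults here are illustrative, not the project's actual code):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap, so content cut
    at a chunk boundary is still intact in the neighbouring chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

For example, a 500-character text with `chunk_size=200, overlap=50` yields chunks starting at offsets 0, 150, 300, and 450, so each consecutive pair shares 50 characters.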
- Python
- LangChain
- ChromaDB (Vector Database)
- HuggingFace Embeddings
- Groq API (LLM - LLaMA 3)
- Unstructured (PDF Loader)
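Under the hood, "semantic search" means embedding both the question and every chunk as vectors, then ranking chunks by cosine similarity. The real stack delegates this to HuggingFace embeddings and ChromaDB; the toy vectors and `top_k` helper below are purely illustrative of the ranking logic:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return indices of the k chunk vectors most similar to the query."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

A vector database like ChromaDB performs the same nearest-neighbour ranking, but over thousands of high-dimensional embeddings with an index instead of a linear scan.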
PDF Document
     │
     ▼
Unstructured File Loader
     │
     ▼
Text Chunking (Splitter)
     │
     ▼
HuggingFace Embeddings
     │
     ▼
Chroma Vector Database
     │
     ▼
Retriever
     │
     ▼
Groq LLM (LLaMA 3)
     │
     ▼
Final Answer + Sources
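The flow above maps almost one-to-one onto LangChain components. The following sketch shows one way to wire it together; the exact import paths, the PDF URL, the embedding model, and the Groq model name are assumptions and may need adjusting to your installed package versions:

```python
# Sketch of the end-to-end RAG pipeline with LangChain.
# Requires GROQ_API_KEY in the environment; all names below are
# illustrative assumptions, not the project's exact configuration.
from langchain_community.document_loaders import UnstructuredURLLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_groq import ChatGroq
from langchain.chains import RetrievalQA

# 1. Load the PDF directly from a URL (hypothetical example URL).
docs = UnstructuredURLLoader(urls=["https://example.com/paper.pdf"]).load()

# 2. Split into overlapping chunks for better retrieval.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# 3. Embed the chunks and store them in a Chroma vector database.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectordb = Chroma.from_documents(chunks, embeddings)

# 4. Build a RetrievalQA chain backed by a Groq-hosted LLaMA 3 model.
llm = ChatGroq(model="llama3-8b-8192")
qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectordb.as_retriever(search_kwargs={"k": 4}),
    return_source_documents=True,  # return answers with source context
)

result = qa.invoke({"query": "What is the main contribution of this document?"})
print(result["result"])
for doc in result["source_documents"]:
    print(doc.metadata)
```

`return_source_documents=True` is what surfaces the source references alongside each answer, so users can verify where the response was grounded.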