Introduction
What is RAG?
In the rapidly evolving world of artificial intelligence, Retrieval-Augmented Generation (RAG) has emerged as a game-changer for enhancing large language models (LLMs) with external knowledge. Traditional LLMs, while powerful, often suffer from hallucinations or outdated information because they're limited to their training data. RAG addresses this by combining retrieval mechanisms with generation, allowing models to pull relevant context from documents in real-time before generating responses.
What are we building and where?
Enter KnowledgeAI, a GitHub repository created by me, Eshwar Prasad Yaddanapudi. This project, hosted at RAG SINGLE DOCUMENT, stands out for its minimalist approach. Unlike many RAG implementations that rely on heavy frameworks like LangChain or LlamaIndex, KnowledgeAI builds everything from scratch using basic libraries. The goal? To demystify RAG's core components—document chunking, embedding generation, vector storage, and retrieval-based prompting—while promoting transparency and customizability.
How to get the best out of this repo?
This repository is particularly valuable for AI enthusiasts, developers, and educators who want to understand RAG "under the hood." It avoids black-box abstractions, encouraging users to write and debug their own logic. With just a few files, including a main Python script, a sample document, and detailed documentation, it serves as an educational tool for building foundational AI workflows. In this article, we'll explore the repo's contents in depth, breaking down its structure, code, and potential applications. Whether you're a beginner or an experienced engineer, KnowledgeAI offers insights into resource-efficient AI systems.
How are the code files written?
Key themes include intentional commenting in the code to build intuition, avoidance of frameworks for better understanding, and emphasis on debuggability. It's designed for those preferring hands-on learning over plug-and-play tools. The project demonstrates end-to-end thinking: from file I/O to LLM orchestration, all while maintaining efficiency.
Overview of how the components come together
Core Components of KnowledgeAI
Code Repo: RAG SINGLE DOCUMENT CODE REPO
At its heart, KnowledgeAI breaks down RAG into modular steps. First, document chunking: The system processes plain text files by buffering paragraphs, detecting transitions via line breaks, and creating overlapping chunks. This overlap—appending the last two lines from previous paragraphs—ensures semantic continuity, preventing context loss at chunk boundaries.
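The chunking step described above can be sketched in plain Python. This is an illustrative reconstruction, not the repo's exact code: the function name and the two-line overlap parameter are assumptions based on the description.

```python
def chunk_document(text: str, overlap_lines: int = 2) -> list[str]:
    """Split text into paragraph chunks, prepending the last
    `overlap_lines` lines of the previous paragraph to each chunk
    so context carries across chunk boundaries."""
    # Paragraph transitions are detected via blank lines (double line breaks)
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    chunks = []
    prev_tail: list[str] = []
    for para in paragraphs:
        # Overlap: carry the tail of the previous paragraph into this chunk
        chunk = "\n".join(prev_tail + [para]) if prev_tail else para
        chunks.append(chunk)
        prev_tail = para.splitlines()[-overlap_lines:]
    return chunks
```

Called on a two-paragraph document, the second chunk begins with the last two lines of the first paragraph, which is exactly the semantic-continuity trick the repo relies on.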
Next, embeddings are generated using the sentence-transformers library with the 'all-MiniLM-L6-v2' model. This lightweight model converts text chunks into dense vectors, capturing meaning efficiently without high computational costs.
These embeddings are stored in a persistent vector database using ChromaDB, a local, open-source solution. ChromaDB allows for quick similarity searches, making retrieval fast and scalable for small setups.
