Rudrani Ghosh
Master's Thesis Project

EduRAG Tutor

Fine-Tuning LLM with RAG for Generating Educational Audio-Visual Content

An AI-powered learning assistant that delivers curriculum-aligned, grade-specific explanations in 7 languages with narrated educational videos. The model is fine-tuned on 346K+ QA pairs using QLoRA on consumer hardware.

Architecture

1. Wikipedia Extraction: 5,782 CBSE/ICSE topics fetched via the Wikipedia API
2. BART Coherence Enhancement: raw text cleaned and rewritten for readability
3. GPT-4o Mini Grade Adaptation: 12 progressively complex versions per topic
4. Question Paraphrasing: 5 semantic variations per topic-grade pair
5. QLoRA Fine-Tuning: TinyLlama 1.1B fine-tuned on 346,920 QA pairs
6. RAG + Web Search + AV Output: FAISS retrieval, DuckDuckGo fallback, typewriter videos

Tech Stack

TinyLlama-1.1B

Base LLM fine-tuned with QLoRA (4-bit)

FAISS + MiniLM

Dense retrieval over Wikipedia corpus

GPT-4o Mini

Grade-level paraphrasing & question augmentation

MyMemory API

Round-trip-validated translation into 6 Indian languages

edge-tts + ffmpeg

Typewriter-animation videos with TTS narration

Wikipedia API

Curriculum-aligned knowledge source

Evaluation Metrics

Perplexity: 9.72 (fluent, coherent generation)
BLEU: 0.032 (high paraphrase diversity)
ROUGE-1 F1: 0.121 (token-level alignment)
ROUGE-L F1: 0.108 (sequence structure retention)
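To relate these numbers to their definitions, here is a minimal sketch of perplexity and ROUGE-1 F1 in plain Python. It assumes whitespace tokenization and per-token natural-log probabilities. The thesis's actual evaluation scripts and tokenizer are not shown on this page, so the function names and inputs are illustrative only.

```python
import math
from collections import Counter


def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 with clipped counts (whitespace tokens, an assumption)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# A perplexity of 9.72 corresponds to a mean NLL of ln(9.72) nats per token.
mean_nll = math.log(9.72)
```

Read this way, the reported perplexity means the model assigns each reference token an average negative log-likelihood of about 2.27 nats.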

Translation Quality

Back-translation BLEU scores for round-trip validation across Indian languages:

Hindi: 0.617
Kannada: 0.536
Telugu: 0.533
Bengali: 0.532
Punjabi: 0.508
Tamil: 0.504
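Round-trip validation can be sketched as follows: translate the English text into the target language, translate it back, and score the back-translation against the original. The `translate` callable, the `score` callable, and the 0.5 threshold are hypothetical placeholders, since the page does not say how MyMemory is called or which cutoff, if any, is applied. The score here is clipped unigram precision so the example runs without a BLEU library. It is a stand-in, not BLEU.

```python
from collections import Counter


def unigram_precision(candidate, reference):
    """Clipped unigram precision: a simple stand-in for BLEU, not BLEU itself."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    total = sum(cand.values())
    return sum((cand & ref).values()) / total if total else 0.0


def round_trip_check(text, lang, translate, score=unigram_precision, threshold=0.5):
    """Translate en -> lang -> en and accept if the score clears the threshold.

    `translate(text, src, tgt)` is a hypothetical callable wrapping whatever
    translation service is used; `threshold` is an illustrative value.
    """
    forward = translate(text, "en", lang)
    back = translate(forward, lang, "en")
    s = score(back, text)
    return {
        "translation": forward,
        "back_translation": back,
        "score": s,
        "accepted": s >= threshold,
    }


# Toy translator for demonstration only: tags text on the way out, untags on the way back.
def fake_translate(text, src, tgt):
    return f"[{tgt}] {text}" if src == "en" else text.split("] ", 1)[1]


result = round_trip_check("Plants make food using sunlight", "hi", fake_translate)
```

In a real run, `fake_translate` would be replaced by a wrapper around the translation service, and `unigram_precision` by a proper BLEU implementation.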

Key Contributions

Consumer-Grade Fine-Tuning

Achieved effective domain adaptation of a 1.1B-parameter model using QLoRA with 4-bit quantization, which made training feasible on a MacBook Pro M2 with 16 GB of unified memory.
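To give a rough sense of why this fits on a 16 GB machine, the sketch below estimates the 4-bit weight footprint and the number of trainable LoRA parameters. The model dimensions are assumed from the publicly released TinyLlama-1.1B configuration. The LoRA rank and target projections (query and value) are illustrative guesses, because the page does not state the adapter settings used in the thesis.

```python
# All values below are assumptions for illustration, not settings reported by the thesis.
HIDDEN = 2048        # assumed hidden size of TinyLlama-1.1B
N_LAYERS = 22        # assumed number of transformer layers
KV_DIM = 256         # assumed key/value projection width (grouped-query attention)
N_PARAMS = 1.1e9     # parameter count quoted on this page
RANK = 8             # hypothetical LoRA rank


def lora_params(d_in, d_out, rank):
    """A LoRA adapter adds two low-rank matrices: A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


def quantized_weight_gb(n_params, bits):
    """Approximate storage for the frozen base weights, ignoring quantization metadata."""
    return n_params * bits / 8 / 1e9


# Adapters on the query projection (HIDDEN -> HIDDEN) and value projection (HIDDEN -> KV_DIM).
per_layer = lora_params(HIDDEN, HIDDEN, RANK) + lora_params(HIDDEN, KV_DIM, RANK)
trainable = per_layer * N_LAYERS
base_gb = quantized_weight_gb(N_PARAMS, bits=4)

print(f"trainable LoRA parameters: {trainable:,}")
print(f"4-bit base weights: ~{base_gb:.2f} GB")
```

Under these assumptions the adapters train roughly 0.1% of the model's parameters, and the frozen base weights occupy about half a gigabyte. Activations, gradients, and optimizer state are not counted here.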

Scalable Data Augmentation

Automated pipeline expanded 5,782 topics into 346,920 diverse QA pairs through systematic grade adaptation and neural paraphrasing.
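The dataset size follows directly from the pipeline figures: 5,782 topics × 12 grade versions × 5 question variations. A short check, with a hypothetical key layout for enumerating the pairs (the topic names and the assumption that the 12 versions map to Classes 1 to 12 are illustrative):

```python
from itertools import product

N_TOPICS = 5_782      # topics extracted from Wikipedia
N_GRADES = 12         # grade-adapted versions per topic
N_PARAPHRASES = 5     # question variations per topic-grade pair

total_pairs = N_TOPICS * N_GRADES * N_PARAPHRASES


def enumerate_pairs(topics, grades, n_paraphrases):
    """Yield one (topic, grade, variant) key per QA pair; the key layout is hypothetical."""
    yield from product(topics, grades, range(1, n_paraphrases + 1))


# Two made-up topics, grades 1..12, five variants each.
sample = list(enumerate_pairs(["Photosynthesis", "Fractions"], range(1, 13), N_PARAPHRASES))
```

Each topic therefore contributes 60 QA pairs, and `total_pairs` equals the 346,920 reported above.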

Multimodal Output

Extended text generation into audio-visual educational content with synchronized typewriter animations and multilingual TTS narration.
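One way to keep a typewriter animation in step with narration is to reveal characters in proportion to elapsed audio time. The sketch below covers only that pacing logic. The linear pacing, the frame rate, and the function name are assumptions, and the actual rendering with edge-tts and ffmpeg is not shown on this page.

```python
def typewriter_frames(text, audio_seconds, fps=24):
    """Return the text visible at each video frame.

    Characters are revealed linearly over the narration (an assumption),
    so the final frame shows the full text as the audio ends.
    """
    n_frames = max(1, round(audio_seconds * fps))
    frames = []
    for i in range(1, n_frames + 1):
        visible = round(len(text) * i / n_frames)
        frames.append(text[:visible])
    return frames


frames = typewriter_frames("Plants make food.", audio_seconds=2.0, fps=10)
```

Each entry in `frames` could then be drawn to an image and the image sequence muxed with the narration audio.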

RAG with Triple Fallback

A hybrid retrieval system combines FAISS-based semantic search, real-time DuckDuckGo web search, and direct Wikipedia fetching, so a query can still be answered when one source returns nothing.
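A fallback chain of this kind can be sketched as an ordered list of retrievers tried in turn. The order (FAISS, then DuckDuckGo, then Wikipedia), the `min_results` rule, and the stub retrievers are assumptions for illustration. The page does not describe how the system decides to fall through.

```python
def retrieve_with_fallback(query, retrievers, min_results=1):
    """Try each (name, retriever) in order and return the first sufficient result.

    Each retriever is a callable taking the query and returning a list of
    passages; a retriever that raises is treated as having found nothing.
    """
    for name, retriever in retrievers:
        try:
            passages = retriever(query)
        except Exception:
            passages = []
        if len(passages) >= min_results:
            return name, passages
    return None, []


# Stub retrievers standing in for the real backends (all hypothetical).
def faiss_search(query):
    local_index = {"photosynthesis": ["Plants convert light into chemical energy."]}
    return local_index.get(query.lower(), [])


def web_search(query):
    raise TimeoutError("simulated network failure")


def wikipedia_fetch(query):
    return [f"Summary of the Wikipedia article on {query}."]


chain = [("faiss", faiss_search), ("duckduckgo", web_search), ("wikipedia", wikipedia_fetch)]
hit = retrieve_with_fallback("Photosynthesis", chain)
miss = retrieve_with_fallback("Black holes", chain)
```

Here `hit` is served by the local index, while `miss` falls through the empty index and the failed web search to the Wikipedia stub.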

Submitted in partial fulfillment of the requirements for the degree of Master of Science in Data Science, VIT Vellore, May 2025.

Supervisor: Dr. Sathyanarayana Sharma K, School of Advanced Sciences.