🎓 CampusKnowledgeBase (CK Base)

🎯 One-Liner

Elevating Campus Learning through Retrieval-Augmented Generation. CK Base is a high-performance AI assistant that synthesizes course materials into cited, verifiable answers for university students.

Note: The current implementation is built using course materials from our college (FY Semester 1 and Computer Engineering Semester 3). Support for other colleges and programs will be added in the future as access to their academic resources becomes available.

🚨 The Problem

Students face massive information overload during exam prep, with hundreds of pages of fragmented PDFs across various platforms.

Unverifiability: General LLMs often hallucinate facts not present in the official syllabus.
Speed: Manually searching through 100+ page PDFs for a single concept is inefficient.
Accuracy: Academic queries require precise grounding in verified institutional data.

✨ Key Features

⚡ Sub-Millisecond Retrieval: Leveraging FAISS for low-latency semantic search across thousands of document fragments.
📊 Accuracy Scoring: Every response includes an Accuracy Score (0-1), generated by a secondary LLM "judge" checking for grounding.
📅 Semester-Aware Filtering: Metadata-locked retrieval ensures answers are specific to the student's current year (FY/SY) and semester.
🛡️ Institutional Security: Integrated with Google OAuth 2.0, restricted to verified university domains.
📍 Precision Citations: Automatic references to the specific PDF and subject used to generate the answer.
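
As a rough illustration of how semester-aware filtering and citations could fit together, the sketch below filters chunks by metadata first and only then ranks the survivors by similarity. It is not the project's code: a brute-force cosine ranking stands in for the FAISS index, and the `Chunk` fields, file names, and two-dimensional toy embeddings are all hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Chunk:
    """One indexed fragment plus the metadata used for filtering and citing."""
    text: str
    subject: str
    source_pdf: str                  # hypothetical file name
    year: str                        # "FY" or "SY"
    semester: int
    embedding: Tuple[float, ...]     # toy vector; real embeddings are much longer


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, year, semester, k=3) -> List[Chunk]:
    """Keep only chunks for the given year and semester, then take the top k."""
    eligible = [c for c in chunks if c.year == year and c.semester == semester]
    ranked = sorted(eligible, key=lambda c: cosine(query_vec, c.embedding), reverse=True)
    return ranked[:k]


def citations(hits) -> List[Tuple[str, str]]:
    """Deduplicated (subject, PDF) pairs in ranking order."""
    refs: List[Tuple[str, str]] = []
    for c in hits:
        ref = (c.subject, c.source_pdf)
        if ref not in refs:
            refs.append(ref)
    return refs


# Toy corpus: two FY Semester 1 chunks and one SY Semester 3 chunk.
CHUNKS = [
    Chunk("Ohm's law ...", "Physics", "physics_unit1.pdf", "FY", 1, (1.0, 0.0)),
    Chunk("Linked lists ...", "Data Structures", "ds_unit2.pdf", "SY", 3, (0.9, 0.1)),
    Chunk("Kirchhoff's laws ...", "Physics", "physics_unit2.pdf", "FY", 1, (0.6, 0.8)),
]

# The SY chunk is excluded even though it is close to the query vector.
hits = retrieve((1.0, 0.0), CHUNKS, year="FY", semester=1, k=2)
refs = citations(hits)
```

Filtering before ranking means a chunk from another semester can never be returned, however similar it is to the query. With a real FAISS index the same effect would need either one index per semester or a filter applied around the search call.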

🛠️ Tech Stack

Component	Technology	Purpose
Backend API	Flask + Python	REST API, RAG orchestration
AI/ML	Google Gemini API	LLM for answer generation & evaluation
Embeddings	Google Text Embeddings	Semantic text representation
Vector Database	FAISS	Sub-millisecond similarity search
Frontend	Next.js + TypeScript	Interactive chat UI
Authentication	Google OAuth 2.0	Secure student sign-in
Infrastructure	Flask Dev Server	Can be deployed on Cloud Run
Data	Campus course PDFs	Processed into chunks + embeddings

🚀 Getting Started

Environment Setup

Create a .env file in the root directory with the following variables:

# Google Gemini API
GEMINI_API_KEY=your_gemini_api_key_here

# Flask Configuration
FLASK_SECRET_KEY=your_secret_key_here
FLASK_ENV=development

# Frontend URL (for CORS)
FRONTEND_URL=http://localhost:3000

# Google OAuth (optional, for auth)
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
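
For orientation, here is one way the backend might read these variables. This is a sketch, not the contents of `config.py`: it assumes the values are already present in the process environment (how the project loads `.env` is not shown here), and the `Settings` class, the `load_settings` helper, and the choice of which keys are required are illustrative assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    flask_secret_key: str
    flask_env: str
    frontend_url: str
    google_client_id: Optional[str]      # OAuth is optional, so these may be None
    google_client_secret: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables.

    Assumption: only the Gemini key and the Flask secret are treated as required.
    """
    required = ("GEMINI_API_KEY", "FLASK_SECRET_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        flask_secret_key=env["FLASK_SECRET_KEY"],
        flask_env=env.get("FLASK_ENV", "development"),
        frontend_url=env.get("FRONTEND_URL", "http://localhost:3000"),
        google_client_id=env.get("GOOGLE_CLIENT_ID") or None,
        google_client_secret=env.get("GOOGLE_CLIENT_SECRET") or None,
    )
```

Failing at startup when a required key is absent tends to be easier to debug than a failed Gemini call later on.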

Installation Steps

1️⃣ Clone the Repository

git clone https://github.com/SohaKhare/CampusKnowledgeBase.git
cd CampusKnowledgeBase

2️⃣ Backend Setup (Python + Flask)

# Navigate to backend
cd aiml

# Initialize and sync the virtual environment (.venv)
uv sync

# Activate the virtual environment (optional: `uv run` uses .venv automatically)
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Start Flask server
uv run main.py

Server runs at: http://localhost:8000

3️⃣ Frontend Setup (Next.js)

# Navigate to frontend (from the repository root; use ../frontend if you are still in aiml/)
cd frontend

# Install dependencies
npm install  # or yarn install

# Start development server
npm run dev  # or yarn dev

Frontend runs at: http://localhost:3000

📁 Project Structure

CampusKnowledgeBase/
├── .gitignore               # Root ignore file (node_modules, .venv, .env)
├── README.md                # The main documentation file
│
├── aiml/                    # AI/ML Backend (Flask)
│   ├── .venv/               # Virtual environment (managed by uv)
│   ├── data/                # Institutional Knowledge (PDFs)
│   │   ├── FY/              # First Year materials
│   │   └── SY/              # Second Year materials
│   ├── ingestion/           # Pipeline to process PDFs into vectors
│   │   ├── ingest.py        # Main script to run indexing
│   │   └── chunker.py       # PDF splitting logic
│   ├── routes/              # Flask Blueprints for API endpoints
│   │   └── auth_routes.py   # Google OAuth logic
│   ├── askllm.py            # Gemini API integration & Scoring logic
│   ├── rag.py               # FAISS retrieval logic
│   ├── config.py            # Environment & app configuration
│   ├── embedder.py          # Vector embedding generation (Google AI)
│   ├── extensions.py        # Shared Flask extensions (DB, Auth, etc.)
│   ├── pyproject.toml       # uv dependency management file
│   └── main.py              # Backend entry point
│
└── frontend/                # Next.js Frontend
    ├── public/              # Static assets (logos, icons)
    ├── src/
    │   ├── app/             # Next.js App Router (pages)
    │   ├── components/      # UI components (Chat, Sidebar, Navbar)
    │   ├── lib/             # Utility functions (API callers)
    │   └── types/           # TypeScript interface definitions
    ├── .env.local           # Frontend environment variables
    ├── package.json         # Node.js dependencies
    └── next.config.ts       # Next.js configuration
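
The tree lists `ingestion/chunker.py` as the PDF splitting logic, but its contents are not shown here. As a hedged sketch of what such a step commonly looks like, the function below splits already-extracted text into overlapping word windows. The name `chunk_words` and the default `size` and `overlap` values are assumptions, and PDF text extraction itself is left out.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into windows of `size` words that share `overlap` words.

    The overlap keeps a sentence that straddles a boundary retrievable
    from at least one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


# Ten numbered "words", windows of 4 with 1 word shared between neighbours.
sample = " ".join(str(i) for i in range(10))
pieces = chunk_words(sample, size=4, overlap=1)
```

Each resulting chunk would then be embedded and stored alongside its year, semester, subject, and source PDF.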

📊 How Accuracy Scoring Works

To reduce AI hallucinations, we implement a Self-Correction Loop:

Retrieval: The system performs a semantic search to fetch the most relevant content chunks from the local data/ store based on the student’s query.
Verification: Gemini cross-checks the generated answer against the retrieved chunks, ensuring that every statement is grounded in the original source material.
Confidence Check: A grounding score (ranging from 0.0 to 1.0) is generated. If the score falls below a predefined threshold, the UI warns the student and encourages cross-verification using the cited PDF sources.
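
The confidence check can be sketched as below. This is only an illustration under stated assumptions: the judge's reply is taken to be a bare number in text form, the threshold value of 0.7 is invented (the README does not state the real one), and `ScoredAnswer`, `parse_score`, and `check_confidence` are hypothetical names. No Gemini call is made here.

```python
from dataclasses import dataclass
from math import isnan
from typing import Sequence, Tuple

DEFAULT_THRESHOLD = 0.7  # assumed value; the project's actual threshold is not documented


@dataclass(frozen=True)
class ScoredAnswer:
    answer: str
    score: float               # grounding score in [0.0, 1.0]
    sources: Tuple[str, ...]   # cited PDFs to show next to the answer
    needs_warning: bool        # True when the UI should prompt cross-verification


def parse_score(raw: str) -> float:
    """Turn the judge's textual reply into a float clamped to [0.0, 1.0].

    Replies that cannot be parsed count as 0.0, so a malformed judge
    response is treated as ungrounded rather than trusted.
    """
    try:
        value = float(raw.strip())
    except ValueError:
        return 0.0
    if isnan(value):
        return 0.0
    return min(1.0, max(0.0, value))


def check_confidence(answer: str, raw_score: str, sources: Sequence[str],
                     threshold: float = DEFAULT_THRESHOLD) -> ScoredAnswer:
    """Attach the score and a warning flag to a generated answer."""
    score = parse_score(raw_score)
    return ScoredAnswer(answer, score, tuple(sources), needs_warning=score < threshold)


# A low score triggers the warning; the PDF name is a placeholder.
result = check_confidence("Ohm's law states ...", "0.42", ["physics_unit1.pdf"])
```

Treating unparsable replies as a score of 0.0 is a deliberately conservative choice: the student sees a warning instead of an unverified answer presented as reliable.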

📧 Contact & Support

Have questions or found a bug? Open an issue or reach out to us!

Built with ❤️ by Saish, Shaurya, Soha and Bhoumik.
