Cloudvelous Chat Assistant

A chatbot built on Retrieval-Augmented Generation (RAG) with agentic workflows, continuous learning from feedback, and automated quality assessment. Built with FastAPI, PostgreSQL (pgvector), React, and multiple LLM providers.

Features

Backend

RAG Architecture: Semantic search over documentation using pgvector vector database
Multi-LLM Support: OpenAI GPT-4o-mini and Google Gemini 1.5 Flash integration with runtime switching
Production Embeddings: OpenAI text-embedding-3-small (1536-dim) for high-quality semantic search
Query Refinement: LLM-powered intelligent query enhancement for better retrieval
Curated Q&A System: Priority layer for instant answers to common questions (80% similarity threshold)
Agentic Workflows: Multi-step reasoning with learned templates (search, validate, refine, synthesize)
Workflow Templates: Automatic learning and reuse of successful reasoning patterns
Contextual Learning: Query-specific chunk boosting to prevent cross-category contamination
Quality Assessment: Multi-signal quality scoring with auto-assessment
Active Learning: Smart sample selection for efficient human review
Tool Execution: Flexible tool calling system for extended capabilities
Gap Detection: Automatic identification of missing knowledge and capabilities
Vector Reindexing: Automatic index optimization for performance
Admin Training Interface: Review sessions, provide feedback, and analyze performance metrics
Dynamic Training Mode: Interactive chat-based refinement of answers, workflows, and knowledge chunks
Self-Improving: Automatically adjusts retrieval accuracy weights based on user feedback
GitHub Integration: Production-ready automated repository documentation ingestion
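
As a rough illustration of how the Curated Q&A priority layer could sit in front of the RAG pipeline, here is a minimal sketch. Only the 80% threshold comes from the feature list above; `CuratedMatch`, `answer_query`, `find_curated`, and `run_rag` are hypothetical names and not the project's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# The 80% similarity threshold named in the feature list.
CURATED_SIMILARITY_THRESHOLD = 0.80


@dataclass
class CuratedMatch:
    """Hypothetical result of a curated Q&A lookup."""
    answer: str
    similarity: float  # assumed to be normalized to [0, 1]


def answer_query(
    query: str,
    find_curated: Callable[[str], Optional[CuratedMatch]],
    run_rag: Callable[[str], str],
) -> tuple[str, str]:
    """Return (answer, source), preferring a curated answer when it is close enough."""
    match = find_curated(query)
    if match is not None and match.similarity >= CURATED_SIMILARITY_THRESHOLD:
        return match.answer, "curated"
    # No sufficiently similar curated entry: fall through to retrieval + generation.
    return run_rag(query), "rag"
```

Passing the lookup and the RAG pipeline in as callables keeps the routing decision testable without a database or an LLM.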

Frontend

Public Chat Interface: Beautiful React UI with syntax highlighting and source attribution
Admin Dashboard: Live statistics, LLM performance analytics, and system metrics
Session Management: Advanced filtering, search, sorting, pagination, and export (CSV/JSON)
Dynamic Training Interface: Interactive chat-based training with gap detection and auto-implementation
Training System: Quick review mode, bulk operations, keyboard shortcuts, undo functionality
Workflow Visualization: Step-by-step workflow execution trace with chunk retrieval details
Template Management: Browse, search, and manage workflow templates
Inspector Tools: Deep session analysis and comparison capabilities
Secure Authentication: API key authentication with protected admin routes
Real-time Updates: React Query for efficient data fetching and optimistic updates
Responsive Design: Mobile-friendly interface with Tailwind CSS
Accessibility: ARIA labels, keyboard navigation, and screen reader support

Architecture

Tech Stack

Backend:

Framework: FastAPI (Python 3.11+) with async support
Database: PostgreSQL 16 with pgvector extension for vector similarity search
Embeddings: OpenAI text-embedding-3-small (1536-dim) for production use
- Note: Sentence Transformers (384-dim) supported but requires database migration
LLM Providers: OpenAI GPT-4o-mini (default), Google Gemini 1.5 Flash
Deployment: Docker Compose for local development and production

Frontend:

Framework: React 19 with TypeScript
Build Tool: Vite 7 for fast development and optimized builds
Styling: Tailwind CSS 3 for responsive design
State Management: TanStack React Query 5 (server state) + Zustand 5 (client state)
Routing: React Router 7 with protected routes
Validation: Zod 3.25 for runtime type safety
HTTP Client: Axios 1.13 with interceptors
Date Handling: date-fns 4.1 for date formatting
Code Display: React Syntax Highlighter 16.1 for code blocks
Testing: Vitest 4.0 + React Testing Library 16.3

Key Components

Embedding Service: Converts text to vector embeddings for semantic search
Retrieval Service: Finds relevant knowledge chunks using vector similarity
Generator Service: Generates responses using retrieved context and LLM
Workflow Learner: Learns from successful query patterns to boost future retrievals
Training System: Collects feedback and adjusts accuracy weights
Quality Assessor: Multi-signal quality scoring and auto-assessment
Active Learner: Smart sample selection for human review
Query Refiner: LLM-powered query enhancement
Template Manager: Workflow template CRUD and scoring
Action Executor: Multi-step workflow execution engine
Inspector Service: Session analysis and comparison tools
Dynamic Training Orchestrator: Interactive training chat coordination
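
To make the Embedding Service and Retrieval Service roles concrete, the sketch below ranks chunks by cosine similarity in plain Python. It is an in-memory illustration only: in this project the similarity search is delegated to pgvector, and the function names here are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(
    query_vec: list[float],
    chunks: list[tuple[str, list[float]]],
    k: int = 5,
) -> list[tuple[str, float]]:
    """Return the k (chunk_id, score) pairs most similar to the query vector."""
    scored = [(chunk_id, cosine_similarity(query_vec, vec)) for chunk_id, vec in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The default `k=5` mirrors the documented `TOP_K_RETRIEVAL` default.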

Quick Start

Prerequisites

Docker and Docker Compose
Git
At least 8GB RAM available for Docker (OpenAI embeddings + LLM operations)
API keys for OpenAI and/or Google Gemini

Installation

Clone the repository

```
git clone <repository-url>
cd cloudvelous-chatbot
```

Configure environment variables
```
cp .env.example .env
```
Edit .env and set required values:
```
# Database
POSTGRES_PASSWORD=your_secure_password

# LLM Provider (choose one or both)
OPENAI_API_KEY=sk-your-openai-key
GEMINI_API_KEY=your-gemini-key
LLM_PROVIDER=openai  # or gemini

# Embedding Configuration
# Default provider is "openai" (1536-dim, text-embedding-3-small).
# Sentence-transformers (384-dim) is supported but requires a database migration.
# EMBED_PROVIDER=openai

# Feature Flags (all default to true in config.py; override here if needed)
# CURATED_QA_ENABLED=true
# AGENTIC_WORKFLOW_ENABLED=true
# CONTEXTUAL_BOOST_ENABLED=true
# AUTO_QUALITY_ASSESSMENT_ENABLED=true
# ACTIVE_LEARNING_ENABLED=true
# BENCHMARKING_ENABLED=true
# REINDEX_ENABLED=true
# DEBUG_WORKFLOW=false
# REFINE_ENABLED=true

# Agentic V1 Smart Review (Phase A1)
AGENTIC_V1_SMART_REVIEW=true

# GitHub Integration
GITHUB_TOKEN=ghp_your-github-token

# Security (generate with: openssl rand -hex 32)
ADMIN_JWT_SECRET=your-jwt-secret-min-32-chars
ADMIN_API_KEY=your-api-key
```
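
If `openssl` is not available, the same kind of random hex secret can be produced with Python's standard library. This is a small sketch; using it for `ADMIN_API_KEY` as well is an assumption, since the comment above only mentions generating secrets with `openssl rand -hex 32`.

```python
import secrets


def generate_hex_secret(num_bytes: int = 32) -> str:
    """Return a random hex string; 32 bytes yields 64 hex characters."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Paste these lines into .env
    print("ADMIN_JWT_SECRET=" + generate_hex_secret())
    print("ADMIN_API_KEY=" + generate_hex_secret())
```
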
How to Generate a GitHub Personal Access Token

Step-by-Step Instructions:
1. Go to GitHub Settings
  - Navigate to https://github.com/settings/tokens
  - Or: Click your profile picture → Settings → Developer settings (left sidebar) → Personal access tokens → Tokens (classic)
2. Generate New Token
  - Click "Generate new token" → "Generate new token (classic)"
3. Configure Token
  - Note/Name: Give it a descriptive name like Cloudvelous Chatbot - Local Dev
  - Expiration: Choose an expiration period (recommend 90 days for development)
4. Select Scopes/Permissions
  
  For this project, you'll need:
  - ✅ repo (Full control of private repositories) - if accessing private repos
  - ✅ public_repo (Access public repositories) - if only accessing public repos
  - ✅ read:org (Read org and team membership) - if working with organization repos
  Minimum required: Just public_repo if you're only ingesting public documentation
5. Generate and Copy
  - Click "Generate token" at the bottom
  - ⚠️ IMPORTANT: Copy the token immediately! You won't be able to see it again
  - The token will start with ghp_
6. Add Token to Your Project
  - Add the token to your .env file as GITHUB_TOKEN=ghp_your_token_here
Start all services
```
docker compose up -d
```

Verify services are running

```
docker compose ps
```

Expected output:

```
NAME                            STATUS          PORTS
cloudvelous-chatbot-db          Up (healthy)    0.0.0.0:5432->5432/tcp
cloudvelous-chatbot-backend     Up              0.0.0.0:8000->8000/tcp
cloudvelous-chatbot-frontend    Up              0.0.0.0:5173->5173/tcp
cloudvelous-chatbot-postadmin   Up              0.0.0.0:5050->80/tcp
```

Check API health

```
curl http://localhost:8000/health
```

Expected response:

```
{"status":"healthy","version":"0.4.0","phase":"3"}
```
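
For scripted checks (for example, waiting for the backend before running ingestion), the health response can be parsed in Python. This sketch assumes only the `status` field shown above; the function names are illustrative.

```python
import json
import urllib.request


def parse_health(body: str) -> bool:
    """Return True when the /health payload reports status == 'healthy'."""
    payload = json.loads(body)
    return payload.get("status") == "healthy"


def check_health(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Fetch /health from the backend and interpret the JSON response."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return parse_health(resp.read().decode("utf-8"))
```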

Access the API documentation
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- pgAdmin: http://localhost:5050 (optional)
Access the Frontend (Included in Docker Compose)

The frontend starts automatically with docker compose up. Access:
- Public Chat: http://localhost:5173/
- Admin Login: http://localhost:5173/admin/login
- Session Management: http://localhost:5173/admin/sessions
- Training Interface: http://localhost:5173/admin/training
Admin Login:
- Use the ADMIN_API_KEY from your .env file
Development Mode (separate from Docker):
```
cd frontend
npm install
npm run dev
```

Usage

Option 1: Web Interface (Recommended)

The easiest way to use Cloudvelous is through the web interface. See Frontend Web Interface in the User Manual for detailed instructions.

For End Users (Public Chat):
- Navigate to http://localhost:5173/
- Type your questions in the chat input
- View responses with source attribution
- No authentication required
For Admins (Sessions & Training):
- Navigate to http://localhost:5173/admin/login
- Enter your admin API key (from .env)
- Access session management and training interface
- Manage, search, and export chat sessions

Option 2: API Endpoints (Direct)

Ask a Question

```
curl -X POST http://localhost:8000/api/ask \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How do I implement authentication?",
    "llm_provider": "openai",
    "llm_model": "gpt-4o-mini"
  }'
```
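
The same request can be issued from Python with only the standard library. This is a minimal sketch: the request body mirrors the curl example, while the response is returned as raw parsed JSON because its schema is not described here (see the Swagger UI for the actual fields).

```python
import json
import urllib.request


def build_ask_request(
    query: str,
    provider: str = "openai",
    model: str = "gpt-4o-mini",
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build the POST /api/ask request without sending it."""
    body = json.dumps(
        {"query": query, "llm_provider": provider, "llm_model": model}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, **kwargs) -> dict:
    """Send the question and return the decoded JSON response."""
    with urllib.request.urlopen(build_ask_request(query, **kwargs), timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```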

Submit Feedback

```
curl -X POST http://localhost:8000/api/train \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": 123,
    "is_correct": true,
    "chunk_feedback": [
      {"chunk_id": 1, "was_useful": true}
    ]
  }'
```

View Admin Sessions

```
curl -X POST http://localhost:8000/api/admin/sessions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-admin-api-key" \
  -d '{
    "skip": 0,
    "limit": 10,
    "feedback_status": "pending"
  }'
```
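
The `skip` and `limit` fields suggest offset-based pagination. The sketch below walks all pages under that assumption; `fetch_page` is a hypothetical callable that wraps the POST above and returns a list of session dicts, since the actual response shape is not documented in this README.

```python
from typing import Callable, Iterator


def iter_sessions(
    fetch_page: Callable[[int, int], list[dict]],
    page_size: int = 10,
) -> Iterator[dict]:
    """Yield sessions page by page until a short (or empty) page is returned."""
    skip = 0
    while True:
        page = fetch_page(skip, page_size)
        yield from page
        if len(page) < page_size:
            break  # last page reached
        skip += page_size
```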

Get Performance Metrics

```
curl -X GET http://localhost:8000/api/admin/stats \
  -H "X-API-Key: your-admin-api-key"
```

Dynamic Training

```
# Start dynamic training session
curl -X POST http://localhost:8000/api/admin/training/dynamic/session \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-admin-api-key" \
  -d '{
    "mode": "edit",
    "session_id": 123
  }'

# Send chat message in training
curl -X POST http://localhost:8000/api/admin/training/dynamic/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-admin-api-key" \
  -d '{
    "training_session_id": "uuid",
    "message": "Add more information about authentication"
  }'
```

Quality Assessment

```
# Get quality signals for a session
curl -X GET http://localhost:8000/api/quality/signals/{session_id} \
  -H "X-API-Key: your-admin-api-key"
```

Workflow Templates

```
# List workflow templates
curl -X POST http://localhost:8000/api/admin/templates/list \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-admin-api-key" \
  -d '{"skip": 0, "limit": 20}'

# Get template details
curl -X GET http://localhost:8000/api/admin/templates/{template_id} \
  -H "X-API-Key: your-admin-api-key"
```

Complete API Documentation

Visit http://localhost:8000/docs for interactive API documentation with:

All available endpoints
Request/response schemas
Try-it-out functionality
Authentication requirements

Project Structure

cloudvelous-chatbot/
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI application entry point
│   │   ├── config.py            # Configuration management (pydantic-settings)
│   │   ├── database.py          # Database connection & session
│   │   ├── exceptions.py        # Custom exception classes
│   │   ├── routers/             # API route handlers
│   │   │   ├── chat.py          # /api/ask endpoint
│   │   │   ├── training.py      # /api/train endpoint
│   │   │   ├── admin.py         # /api/admin/* endpoints
│   │   │   ├── dynamic_training.py  # /api/admin/training/dynamic/*
│   │   │   ├── workflows.py     # /api/workflows/* endpoints
│   │   │   ├── quality.py       # /api/quality/* endpoints
│   │   │   ├── admin_templates.py   # /api/admin/templates/*
│   │   │   ├── inspector.py     # /api/embedding-inspector/*
│   │   │   ├── iterative_sessions.py  # /api/sessions/iterative/*
│   │   │   ├── workflow_candidates.py # /api/workflows/candidates/*
│   │   │   ├── offline_training.py    # /api/workflows/offline/*
│   │   │   ├── v2_metrics.py    # /api/v2-metrics/* endpoints
│   │   │   ├── phase_a_admin.py # /api/admin/phase-a/*
│   │   │   └── phase_b_admin.py # /api/admin/phase-b/*
│   │   ├── services/            # Business logic layer
│   │   │   ├── embedder.py      # Text embedding generation
│   │   │   ├── retriever.py     # Semantic search & ranking
│   │   │   ├── generator.py     # LLM response generation
│   │   │   ├── workflow_learner.py      # Pattern learning
│   │   │   ├── workflow_planner.py      # Multi-step planning
│   │   │   ├── workflow_regenerator.py  # Dynamic workflow regeneration
│   │   │   ├── workflow_adapter.py      # Workflow adaptation
│   │   │   ├── workflow_distiller.py    # Workflow distillation
│   │   │   ├── action_executor.py       # Workflow execution
│   │   │   ├── template_learner.py      # Template management
│   │   │   ├── workflow_tracer.py       # Reasoning capture
│   │   │   ├── workflow_serializer.py   # Workflow serialization
│   │   │   ├── session_orchestrator.py  # Session orchestration
│   │   │   ├── contextual_boost.py      # Query-specific learning
│   │   │   ├── quality_metrics.py       # Quality calculation
│   │   │   ├── auto_quality_assessor.py # Auto-quality scoring
│   │   │   ├── active_learner.py        # Smart sample selection
│   │   │   ├── benchmark_tracker.py     # Performance tracking
│   │   │   ├── feedback_analyzer.py     # Feedback analysis
│   │   │   ├── missing_capability_detector.py  # Gap detection
│   │   │   ├── cannot_answer_detector.py  # Cannot-answer detection
│   │   │   ├── reindex_service.py       # Vector reindexing
│   │   │   ├── dynamic_training_service.py    # Interactive training
│   │   │   ├── training_chat_orchestrator.py  # Chat coordination
│   │   │   ├── inspector_service.py     # Session analysis
│   │   │   ├── template_admin_service.py  # Template management
│   │   │   ├── admin_service.py         # Admin operations
│   │   │   ├── v2_metrics.py            # V2 metrics service
│   │   │   ├── shadow_mode.py           # Shadow mode (V1/V2 comparison)
│   │   │   ├── phase_a_metrics.py       # Phase A observability
│   │   │   ├── phase_b_regenerator_eval.py  # Phase B evaluation
│   │   │   └── debug_workflow_logger.py # Workflow debug logging
│   │   ├── models/              # SQLAlchemy ORM models
│   │   │   ├── embeddings.py    # KnowledgeChunk model
│   │   │   ├── training_sessions.py     # TrainingSession model
│   │   │   ├── embedding_links.py       # Chunk-session links
│   │   │   ├── feedback.py              # TrainingFeedback model
│   │   │   ├── curated_qa.py            # CuratedQA model
│   │   │   ├── qa_suggestions.py        # QASuggestion model
│   │   │   ├── workflow_templates.py    # WorkflowTemplate model
│   │   │   ├── workflow_actions.py      # WorkflowAction model
│   │   │   ├── workflow_executions.py   # WorkflowExecution model
│   │   │   ├── workflow_execution_steps.py  # Step results
│   │   │   ├── workflow_step_chunks.py  # Chunks per step
│   │   │   ├── workflow_categories.py   # Taxonomy
│   │   │   ├── workflow_vectors.py      # Workflow embeddings
│   │   │   ├── workflow_candidates.py   # Workflow candidate tracking
│   │   │   ├── required_tools.py        # Tool tracking
│   │   │   ├── improvement_tasks.py     # Action items
│   │   │   ├── answer_quality_signals.py    # Quality signals
│   │   │   ├── active_learning_queue.py     # Learning queue
│   │   │   ├── benchmark_results.py         # Performance tracking
│   │   │   ├── learning_experiments.py      # A/B testing
│   │   │   ├── dynamic_training_sessions.py # Training sessions
│   │   │   ├── reindex_state.py         # Reindex state tracking
│   │   │   ├── conversation_context.py  # Conversation context
│   │   │   ├── shadow_comparison.py     # Shadow mode comparisons
│   │   │   ├── iterative_sessions.py    # Iterative sessions
│   │   │   ├── session_feedback.py      # Session feedback items
│   │   │   ├── offline_experiments.py   # Offline experiment data
│   │   │   ├── phase_a_metrics.py       # Phase A observability models
│   │   │   └── phase_b_regenerator_metrics.py  # Phase B eval models
│   │   ├── schemas/             # Pydantic request/response schemas
│   │   ├── llm/                 # LLM provider implementations
│   │   │   ├── base.py          # ILLMProvider interface
│   │   │   ├── factory.py       # Provider factory pattern
│   │   │   ├── openai_provider.py   # OpenAI integration (gpt-4o-mini)
│   │   │   └── gemini_provider.py   # Google Gemini integration (1.5-flash)
│   │   ├── training/            # Training & optimization
│   │   │   └── optimizer.py     # Accuracy weight optimizer
│   │   ├── middleware/          # Authentication & middleware
│   │   │   └── auth.py          # JWT & API key authentication
│   │   └── utils/               # Logging, rate limiting, utilities
│   ├── tests/                   # Unit and integration tests
│   │   ├── unit/                # Unit tests
│   │   ├── integration/         # Integration tests
│   │   └── ingestion/           # Ingestion pipeline tests
│   ├── alembic/                 # Database migrations
│   │   └── versions/            # Migration scripts
│   ├── requirements.txt         # Python dependencies
│   ├── pytest.ini               # Test configuration
│   └── Dockerfile               # Backend container image
├── scripts/
│   ├── init-db.sql              # Database initialization SQL
│   ├── initial_ingestion.py     # Data ingestion script
│   ├── manual_retrain.py        # Manual retraining trigger
│   └── ingestion/               # GitHub ingestion pipeline
│       ├── github_client.py     # GitHub API client
│       ├── file_fetcher.py      # Repository file fetcher
│       ├── text_chunker.py      # Semantic text chunking
│       ├── embedding_processor.py  # Embedding generation
│       └── db_writer.py         # Database storage (PostgreSQL/pgvector)
├── frontend/
│   ├── src/
│   │   ├── api/                 # API service layer
│   │   │   ├── client.ts        # Axios client configuration
│   │   │   ├── queryClient.ts   # TanStack React Query client
│   │   │   ├── admin.service.ts     # Admin API calls
│   │   │   ├── chat.service.ts      # Chat API calls
│   │   │   ├── training.service.ts  # Training API calls
│   │   │   ├── workflow.service.ts  # Workflow API calls
│   │   │   ├── health.service.ts    # Health check API calls
│   │   │   └── dynamic-training.service.ts  # Dynamic training API
│   │   ├── components/
│   │   │   ├── admin/
│   │   │   │   ├── DynamicTrainingChat.tsx
│   │   │   │   ├── DynamicTrainingQuickGuide.tsx
│   │   │   │   ├── WorkflowTrace.tsx
│   │   │   │   ├── WorkflowGapIndicator.tsx
│   │   │   │   ├── SessionInspectorModal.tsx
│   │   │   │   ├── SessionTable.tsx
│   │   │   │   ├── SessionRowActions.tsx
│   │   │   │   ├── QuickReviewCard.tsx
│   │   │   │   ├── FeedbackForm.tsx
│   │   │   │   ├── ChunkList.tsx
│   │   │   │   ├── ChunkCard.tsx
│   │   │   │   ├── ChunkSelectorModal.tsx
│   │   │   │   ├── ImprovementSuggestionsPanel.tsx
│   │   │   │   ├── CustomSuggestionModal.tsx
│   │   │   │   ├── Pagination.tsx
│   │   │   │   └── KeyboardShortcutsHelp.tsx
│   │   │   ├── chat/
│   │   │   │   ├── ChatInterface.tsx
│   │   │   │   ├── MessageList.tsx
│   │   │   │   ├── Message.tsx
│   │   │   │   └── MessageInput.tsx
│   │   │   ├── auth/
│   │   │   │   └── ProtectedRoute.tsx
│   │   │   └── ui/              # Reusable UI components
│   │   │       ├── Dialog.tsx
│   │   │       ├── Modal.tsx
│   │   │       ├── LoadingSpinner.tsx
│   │   │       └── EmptyState.tsx
│   │   ├── pages/               # Page components
│   │   │   ├── AdminLoginPage.tsx
│   │   │   ├── SessionsPage.tsx
│   │   │   ├── TrainingPage.tsx
│   │   │   └── NotFoundPage.tsx
│   │   ├── hooks/               # Custom React hooks
│   │   │   ├── useChat.ts
│   │   │   ├── useSession.ts
│   │   │   ├── useInspector.ts
│   │   │   ├── useSuggestions.ts
│   │   │   └── useKeyboardShortcuts.ts
│   │   ├── store/               # Zustand stores
│   │   │   └── chatStore.ts
│   │   ├── contexts/            # React contexts
│   │   │   └── AuthContext.tsx   # Auth context with useAuth hook
│   │   ├── config/              # Configuration
│   │   │   ├── api.config.ts
│   │   │   └── app.config.ts
│   │   ├── types/               # TypeScript types
│   │   ├── utils/               # Utility functions
│   │   ├── App.tsx              # Root component with routing
│   │   └── main.tsx             # Entry point
│   ├── public/                  # Static assets
│   ├── Dockerfile               # Frontend container image
│   ├── vite.config.ts           # Vite configuration
│   ├── tailwind.config.js       # Tailwind configuration
│   └── package.json             # Dependencies
├── docker-compose.yml           # Docker Compose (4 services)
├── .env.example                 # Environment variable template
├── USER_MANUAL.md               # Complete user guide with testing
├── DYNAMIC_TRAINING_MANUAL.md   # Dynamic training reference
├── QUICK_START_DYNAMIC_TRAINING.md  # 5-minute guide
└── README.md                    # This file

Development

Local Development Setup

```
# Install backend dependencies locally (optional)
cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

# Run locally without Docker
python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

Managing Services

```
# Start all services
docker compose up -d

# Start specific service
docker compose up -d backend

# Stop and remove all containers (volumes are kept)
docker compose down

# Stop containers without removing them
docker compose stop

# Restart service
docker compose restart backend

# Rebuild after code changes
docker compose up -d --build backend

# View logs
docker compose logs -f backend

# View resource usage
docker compose stats

# Clean up everything (including volumes)
docker compose down -v
```

Database Operations

```
# Access PostgreSQL shell
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot

# Common SQL commands
\dt          # List tables
\d tablename # Describe table
SELECT COUNT(*) FROM knowledge_chunks;  # Count chunks

# Run migrations
docker compose exec backend alembic upgrade head

# Create new migration
docker compose exec backend alembic revision --autogenerate -m "Add new column"

# Rollback last migration
docker compose exec backend alembic downgrade -1

# View migration history
docker compose exec backend alembic history
```

Testing

The project has comprehensive test coverage across unit, integration, and E2E test levels.

Test Categories:

Unit tests: Test individual components in isolation (default, always run)
Integration tests: Test component interactions with mocked dependencies (requires -m integration)
E2E tests: Test complete workflows with real PostgreSQL database (requires -m e2e and environment setup)

```
# Run unit tests only (default)
docker compose exec backend pytest

# Run all tests including integration tests
docker compose exec backend pytest -m "unit or integration"

# Run specific test categories
docker compose exec backend pytest -m unit          # Unit tests only
docker compose exec backend pytest -m integration   # Integration tests only
docker compose exec backend pytest -m e2e          # E2E tests only (requires setup)

# Run tests for specific features
docker compose exec backend pytest tests/unit/test_embedder.py
docker compose exec backend pytest tests/integration/test_training_router.py
docker compose exec backend pytest tests/ingestion/ -v

# Run with coverage report
docker compose exec backend pytest --cov=app --cov-report=html
open htmlcov/index.html  # View coverage report

# Run tests matching pattern
docker compose exec backend pytest -k "test_retrieval"

# Run with verbose output
docker compose exec backend pytest -v
```

E2E Testing (Real Database):

E2E tests verify actual database behavior with PostgreSQL + pgvector:

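```
# Set up E2E test environment
export TEST_DATABASE_URL="postgresql://user:pass@localhost:5432/testdb"
export RUN_E2E_TESTS=true

# Run E2E tests for training feedback.
# Variables exported on the host are not inherited by the container,
# so pass them to `docker compose exec` explicitly with -e.
docker compose exec -e TEST_DATABASE_URL="$TEST_DATABASE_URL" -e RUN_E2E_TESTS="$RUN_E2E_TESTS" backend pytest tests/integration/test_training_e2e.py -m e2e -v

# Run specific E2E test
docker compose exec -e TEST_DATABASE_URL="$TEST_DATABASE_URL" -e RUN_E2E_TESTS="$RUN_E2E_TESTS" backend pytest tests/integration/test_training_e2e.py::TestTrainingFeedbackE2E::test_submit_feedback_updates_was_useful_flags_in_real_db -xvs
```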

Test Coverage:

Unit tests: 150+ tests
Integration tests: 50+ tests
E2E tests: 20+ tests
Frontend tests: Component and hook tests
Total: 220+ tests with comprehensive coverage

Code Quality

```
# Format code with Black
docker compose exec backend black app/

# Lint with Flake8
docker compose exec backend flake8 app/

# Type checking with MyPy
docker compose exec backend mypy app/

# Run all quality checks
docker compose exec backend sh -c "black app/ && flake8 app/ && mypy app/"
```

Configuration

All configuration is managed through environment variables defined in .env.

Core Settings

| Variable | Description | Default | Required |
| --- | --- | --- | --- |
| `POSTGRES_PASSWORD` | PostgreSQL password | - | Yes |
| `DATABASE_URL` | Full database connection string | Auto-generated | No |
| `OPENAI_API_KEY` | OpenAI API key | - | If using OpenAI |
| `GEMINI_API_KEY` | Gemini API key | - | If using Gemini |
| `GITHUB_TOKEN` | GitHub API access token | - | For ingestion |
| `LLM_PROVIDER` | Default LLM provider | `openai` | No |

Embedding Settings

| Variable | Description | Default | Required |
| --- | --- | --- | --- |
| `EMBED_PROVIDER` | Embedding provider | `openai` | No |
| `OPENAI_EMBEDDING_MODEL` | OpenAI model name | `text-embedding-3-small` | If using OpenAI |
| `OPENAI_EMBEDDING_DIMENSIONS` | Embedding dimensions | `1536` | If using OpenAI |
| `SENTENCE_TRANSFORMERS_MODEL` | Sentence transformers model | `all-MiniLM-L6-v2` | If using sentence-transformers |
| `SENTENCE_TRANSFORMERS_DIMENSIONS` | Embedding dimensions | `384` | If using sentence-transformers |
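
Because the two providers produce vectors of different sizes (1536 vs. 384), mixing them in one table would break similarity search, which is why switching providers requires a database migration. A minimal guard could look like the sketch below; the provider key `"sentence-transformers"` and the function name are assumptions for illustration.

```python
# Dimensions taken from the defaults in the table above.
EXPECTED_DIMENSIONS = {"openai": 1536, "sentence-transformers": 384}


def validate_embedding(vector: list[float], provider: str = "openai") -> list[float]:
    """Raise ValueError if the vector length does not match the provider's dimension."""
    expected = EXPECTED_DIMENSIONS[provider]
    if len(vector) != expected:
        raise ValueError(
            f"{provider} embeddings must have {expected} dimensions, got {len(vector)}"
        )
    return vector
```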

Feature Flags

| Variable | Description | Default |
| --- | --- | --- |
| `CURATED_QA_ENABLED` | Enable curated Q&A priority layer | `true` |
| `AGENTIC_WORKFLOW_ENABLED` | Enable multi-step workflows | `true` |
| `CONTEXTUAL_BOOST_ENABLED` | Enable query-specific learning | `true` |
| `AUTO_QUALITY_ASSESSMENT_ENABLED` | Enable auto-quality scoring | `true` |
| `ACTIVE_LEARNING_ENABLED` | Enable smart sample selection | `true` |
| `REFINE_ENABLED` | Enable LLM-based query refinement | `true` |
| `BENCHMARKING_ENABLED` | Enable performance tracking | `true` |
| `REINDEX_ENABLED` | Enable automatic reindexing | `true` |
| `DEBUG_WORKFLOW` | Enable detailed workflow tracing | `false` |

Retrieval Settings

| Variable | Description | Default |
| --- | --- | --- |
| `TOP_K_RETRIEVAL` | Number of chunks to retrieve | `5` |
| `WORKFLOW_BOOST_FACTOR` | Boost for workflow matches | `1.2` |
| `CHUNK_WEIGHT_ADJUSTMENT_RATE` | Learning rate for weights | `0.1` |
| `MIN_CHUNK_WEIGHT` | Minimum accuracy weight | `0.5` |
| `MAX_CHUNK_WEIGHT` | Maximum accuracy weight | `2.0` |
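
To show how these settings could interact, here is a sketch of a clamped weight update and a boosted ranking score. The constants are the documented defaults; the additive update rule and the multiplicative scoring formula are assumptions, and the real logic in `training/optimizer.py` may differ.

```python
# Documented defaults from the Retrieval Settings table.
MIN_CHUNK_WEIGHT = 0.5
MAX_CHUNK_WEIGHT = 2.0
CHUNK_WEIGHT_ADJUSTMENT_RATE = 0.1
WORKFLOW_BOOST_FACTOR = 1.2


def adjust_weight(weight: float, was_useful: bool) -> float:
    """Nudge a chunk's accuracy weight up or down, clamped to [MIN, MAX] (assumed rule)."""
    delta = CHUNK_WEIGHT_ADJUSTMENT_RATE if was_useful else -CHUNK_WEIGHT_ADJUSTMENT_RATE
    return min(MAX_CHUNK_WEIGHT, max(MIN_CHUNK_WEIGHT, weight + delta))


def ranking_score(similarity: float, weight: float, workflow_match: bool = False) -> float:
    """Combine vector similarity, learned weight, and an optional workflow boost (assumed formula)."""
    boost = WORKFLOW_BOOST_FACTOR if workflow_match else 1.0
    return similarity * weight * boost
```

Clamping keeps a run of negative feedback from pushing a chunk's weight to zero, and a run of positive feedback from letting one chunk dominate every query.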

Workflow Learning

| Variable | Description | Default |
| --- | --- | --- |
| `WORKFLOW_EMBEDDING_ENABLED` | Enable workflow learning | `true` |
| `WORKFLOW_SIMILARITY_WEIGHT` | Weight for workflow similarity | `0.3` |
| `FEEDBACK_THRESHOLD_FOR_RETRAIN` | Feedback count trigger | `50` |
| `MIN_WORKFLOW_CLUSTER_SIZE` | Minimum cluster size | `3` |

Quality Assessment Settings

| Variable | Description | Default |
| --- | --- | --- |
| `CANNOT_ANSWER_MIN_CONFIDENCE` | Minimum confidence for quality | `0.5` |
| `AUTO_LABEL_CONFIDENCE_THRESHOLD` | Threshold for auto-labeling | `0.8` |
| `ACTIVE_LEARNING_BUDGET` | Reviews per day | `50` |
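
One plausible reading of the auto-label threshold and the daily review budget is sketched below: confident sessions are labeled automatically and the least confident of the rest are queued for human review. The least-confident-first ordering (uncertainty sampling) and the `triage` function are assumptions, not the project's documented selection strategy.

```python
def triage(
    samples: list[tuple[int, float]],
    auto_label_threshold: float = 0.8,
    budget: int = 50,
) -> tuple[list[tuple[int, float]], list[tuple[int, float]]]:
    """Split (session_id, confidence) pairs into auto-labeled and human-review lists."""
    auto_labeled = [s for s in samples if s[1] >= auto_label_threshold]
    # Least confident first, capped at the daily review budget.
    uncertain = sorted(
        (s for s in samples if s[1] < auto_label_threshold),
        key=lambda s: s[1],
    )
    return auto_labeled, uncertain[:budget]
```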

Security Settings

| Variable | Description | Required |
| --- | --- | --- |
| `ADMIN_JWT_SECRET` | JWT signing secret (min 32 chars) | Yes |
| `ADMIN_API_KEY` | Admin API authentication key | Yes |
| `JWT_ALGORITHM` | JWT signing algorithm | No (HS256) |
| `JWT_EXPIRATION_HOURS` | Token expiry time (hours) | No (24) |
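
A startup check along the following lines can catch a missing or too-short secret before the backend serves traffic. It is a sketch based only on the table above; the project itself manages settings through pydantic-settings in `config.py`, and `validate_security_settings` is a hypothetical helper.

```python
def validate_security_settings(env: dict[str, str]) -> dict[str, object]:
    """Check required security variables and fill in the documented defaults."""
    errors = []
    secret = env.get("ADMIN_JWT_SECRET", "")
    if len(secret) < 32:
        errors.append("ADMIN_JWT_SECRET must be at least 32 characters")
    if not env.get("ADMIN_API_KEY"):
        errors.append("ADMIN_API_KEY is required")
    if errors:
        raise ValueError("; ".join(errors))
    return {
        "ADMIN_JWT_SECRET": secret,
        "ADMIN_API_KEY": env["ADMIN_API_KEY"],
        "JWT_ALGORITHM": env.get("JWT_ALGORITHM", "HS256"),
        "JWT_EXPIRATION_HOURS": int(env.get("JWT_EXPIRATION_HOURS", "24")),
    }
```

In practice it would be called with `dict(os.environ)` after loading `.env`.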

Monitoring & Debugging

Viewing Logs

```
# Follow all logs
docker compose logs -f

# Follow specific service
docker compose logs -f backend

# Last 100 lines
docker compose logs --tail=100 backend

# Export logs to file
docker compose logs > logs.txt

# Search logs for errors
docker compose logs backend | grep -i error
```

Health Checks

```
# Backend health
curl http://localhost:8000/health

# Database connection (database.py lives directly under app/, see Project Structure;
# raw SQL strings must be wrapped in text() on current SQLAlchemy versions)
docker compose exec backend python -c "
from sqlalchemy import text
from app.database import SessionLocal
db = SessionLocal()
print('Connected:', db.execute(text('SELECT 1')).scalar())
db.close()
"

# Check embedder
docker compose exec backend python -c "
from app.services.embedder import EmbeddingService
embedder = EmbeddingService()
print('Provider:', embedder.provider, '| Dimensions:', embedder.dimension)
"
```

Performance Monitoring

```
# Container resource usage
docker compose stats

# Database stats
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot -c "
SELECT
  schemaname,
  tablename,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
"

# API endpoint stats (requires admin API key)
curl -H "X-API-Key: your-admin-api-key" \
  http://localhost:8000/api/admin/stats
```

Troubleshooting

Common Issues

Backend won't start - "ModuleNotFoundError"

```
# Rebuild container
docker compose up -d --build backend
```

Database connection refused

```
# Check PostgreSQL is running
docker compose ps postgres

# Restart PostgreSQL
docker compose restart postgres

# Wait a few seconds and restart backend
sleep 5
docker compose restart backend
```

Port already in use

```
# Find what's using the port
lsof -i :8000

# Kill the process or change port in docker-compose.yml
```

Slow query responses

```
# Check database indices
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot -c "\di"

# Check embedding count
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot -c \
  "SELECT COUNT(*) FROM knowledge_chunks;"
```

For more detailed troubleshooting, check the Docker logs and health endpoints described above.

GitHub Integration

The system includes a production-ready GitHub repository ingestion pipeline for automated documentation processing with comprehensive testing.

Status

The pipeline currently provides:

✅ Complete end-to-end ingestion workflow
✅ OpenAI embeddings (1536-dim) in production use
✅ Transaction management with rollback support
✅ Comprehensive test coverage (220+ tests)
✅ CLI interface for repository ingestion
✅ Batch processing and error handling

Features

Repository file fetching with type filtering (markdown, code, docs)
Recursive directory traversal with skip patterns
Smart text chunking by markdown headers and paragraphs
Vector embedding generation with OpenAI text-embedding-3-small (1536-dim)
Database storage with PostgreSQL and pgvector extension
Transaction management with rollback on failure
Repository-level delete and re-insert strategy
Full orchestration pipeline with CLI (scripts/initial_ingestion.py)
Progress tracking and summary reporting
Error handling and partial failure recovery
End-to-end integration testing with real GitHub repositories
Support for batch processing and lazy model loading
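
The "chunking by markdown headers and paragraphs" step can be pictured with the simplified sketch below. It is not the implementation in `scripts/ingestion/text_chunker.py`; `chunk_markdown` is a hypothetical function that emits one chunk per paragraph, tagged with the nearest preceding header, and it does not special-case fenced code blocks.

```python
import re

# ATX-style markdown headers: one to six '#' followed by whitespace and a title.
HEADER = re.compile(r"^#{1,6}\s+(.*)$")


def chunk_markdown(text: str) -> list[dict[str, str]]:
    """Split markdown into paragraph chunks labeled with their section heading."""
    chunks: list[dict[str, str]] = []
    heading = ""
    paragraph: list[str] = []

    def flush() -> None:
        # Emit the paragraph collected so far under the current heading.
        if paragraph:
            chunks.append({"heading": heading, "text": " ".join(paragraph)})
            paragraph.clear()

    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            flush()
            heading = match.group(1).strip()
        elif not line.strip():
            flush()  # a blank line ends the paragraph
        else:
            paragraph.append(line.strip())
    flush()
    return chunks
```

Keeping the heading alongside each chunk preserves some section context for retrieval even when paragraphs are embedded independently.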

Incremental Updates

Repositories are currently refreshed by full re-ingestion rather than by detecting changes:

Use the --clear-first flag to delete and re-ingest a repository
SHA-based change detection is reserved for a future optimization

Using the Ingestion Pipeline

The complete ingestion pipeline is now available via command-line interface:

```
# Ingest a single repository
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo-name"

# Ingest multiple repositories
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo1 --repo owner/repo2 --repo owner/repo3"

# Clear existing data before ingestion
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo-name --clear-first"
```

Testing the Ingestion Pipeline

You can test the implemented components:

```
# Run all tests
docker compose exec backend pytest -v

# Run all ingestion tests
docker compose exec backend pytest tests/ingestion/ -v

# Run end-to-end integration tests
docker compose exec backend pytest tests/ingestion/test_e2e.py -m e2e -v

# Test individual components
docker compose exec backend pytest tests/ingestion/test_file_fetcher.py -v
docker compose exec backend pytest tests/ingestion/test_text_chunker.py -v
docker compose exec backend pytest tests/ingestion/test_embedding_processor.py -v
docker compose exec backend pytest tests/ingestion/test_db_writer.py -v
docker compose exec backend pytest tests/ingestion/test_orchestration.py -v

# Test embedder fallback logic
docker compose exec backend pytest tests/unit/test_embedder_fallback.py -v

# Manual component testing
docker compose exec backend python scripts/test_file_fetcher.py
docker compose exec backend python scripts/test_text_chunker.py
```

Test Results:

Comprehensive test coverage across all ingestion components
Unit, integration, and E2E tests included
All tests passing with high coverage

Configuration

Add to your .env file:

# GitHub API access token
GITHUB_TOKEN=ghp_your_github_token_here
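
To show how the token might be consumed, here is a small sketch that reads GITHUB_TOKEN and builds request headers for the GitHub REST API. The helper name github_headers is hypothetical. The project's own GitHubClient presumably handles authentication itself, so treat this only as a way to check that the variable is set.

```python
import os
from typing import Dict, Mapping


def github_headers(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build GitHub REST API headers from GITHUB_TOKEN, failing fast if it is unset."""
    token = env.get("GITHUB_TOKEN", "").strip()
    # Reject an empty value and the placeholder from the .env example.
    if not token or token == "ghp_your_github_token_here":
        raise RuntimeError("GITHUB_TOKEN is not set; add it to your .env file")
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }
```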

CLI Usage

The orchestration pipeline provides a complete command-line interface:

# Basic usage - ingest a repository
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo anthropics/anthropic-sdk-python"

# Ingest multiple repositories at once
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo fastapi/fastapi --repo pydantic/pydantic --repo psycopg/psycopg"

# Clear the repository's existing data before ingestion
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo --clear-first"

# The script will:
# 1. Fetch all files from the repository
# 2. Chunk them into semantic pieces
# 3. Generate embeddings for each chunk
# 4. Store them in PostgreSQL with pgvector
# 5. Display progress and summary statistics

Python API Usage

You can also use the components programmatically:

from scripts.ingestion.github_client import GitHubClient
from scripts.ingestion.file_fetcher import FileFetcher
from scripts.ingestion.text_chunker import TextChunker
from scripts.ingestion.embedding_processor import EmbeddingProcessor
from scripts.ingestion.db_writer import DatabaseWriter

# Initialize components
client = GitHubClient()
fetcher = FileFetcher(include_markdown=True, include_code=True)
chunker = TextChunker(target_chunk_size=1500)
processor = EmbeddingProcessor()
writer = DatabaseWriter()

# Process repository (use the same name for fetching and for storage)
repo_name = "owner/repo-name"
repo = client.get_repository(repo_name)
files = fetcher.fetch_repository_files(repo)

all_chunks = []
for file in files:
    all_chunks.extend(chunker.chunk_file(file))

try:
    embedded_chunks = processor.embed_chunks(all_chunks)
    result = writer.write_repository(repo_name, embedded_chunks)
    print(f"Successfully ingested {result['inserted_count']} chunks")
finally:
    # Release the database connection even if embedding or writing fails
    writer.close()

Query Refinement

The system includes LLM-powered query refinement to improve retrieval quality:

Automatic Enhancement: Ambiguous queries are clarified before retrieval
Context Preservation: User intent maintained while adding specificity
Configurable: Enable/disable via REFINE_ENABLED flag
Multi-LLM Support: Works with OpenAI GPT-4o-mini or Gemini

Usage: Query refinement happens automatically when enabled. View refined queries in the session inspector.

Configuration:

REFINE_ENABLED=true
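
A flag-gated refinement step might be wired up as in the sketch below. Only the REFINE_ENABLED name comes from this README. The helpers env_flag and maybe_refine are hypothetical, and refine stands in for whatever LLM call the backend actually makes.

```python
import os
from typing import Callable


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable such as REFINE_ENABLED as a boolean."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def maybe_refine(query: str, refine: Callable[[str], str]) -> str:
    """Return the refined query when enabled; fall back to the original on failure."""
    if not env_flag("REFINE_ENABLED"):
        return query
    try:
        refined = refine(query).strip()
    except Exception:
        # A failed LLM call should not block retrieval.
        return query
    # An empty refinement is treated as "no improvement".
    return refined or query
```

Falling back to the original query keeps retrieval working when the LLM provider is unavailable.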

Quality Assessment

Automatic quality assessment provides multi-signal scoring:

Signals

Retrieval Confidence: Vector similarity scores
Coherence: Answer structure and completeness
Source Agreement: Consistency across retrieved chunks
Historical Similarity: Comparison with past successful answers
Completeness: Coverage of query aspects
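
One simple way to combine the five signals above is a weighted average, sketched here. The weights, the key names, and the function quality_score are assumptions for illustration. The README does not state how the backend actually combines the signals.

```python
from typing import Dict

# Hypothetical weights; they sum to 1.0 so the score stays in [0, 1].
SIGNAL_WEIGHTS: Dict[str, float] = {
    "retrieval_confidence": 0.30,
    "coherence": 0.20,
    "source_agreement": 0.20,
    "historical_similarity": 0.15,
    "completeness": 0.15,
}


def quality_score(signals: Dict[str, float]) -> float:
    """Combine per-signal scores in [0, 1] into one weighted quality score."""
    missing = SIGNAL_WEIGHTS.keys() - signals.keys()
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    total = 0.0
    for name, weight in SIGNAL_WEIGHTS.items():
        # Clamp each signal so one out-of-range value cannot skew the result.
        value = min(1.0, max(0.0, signals[name]))
        total += weight * value
    return total
```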

Active Learning

Smart sample selection for human review
Confidence-based prioritization
Batch review interface in admin panel
Automatic labeling for high-confidence cases

Configuration:

AUTO_QUALITY_ASSESSMENT_ENABLED=true
ACTIVE_LEARNING_ENABLED=true
ACTIVE_LEARNING_BUDGET=50
CANNOT_ANSWER_MIN_CONFIDENCE=0.5
AUTO_LABEL_CONFIDENCE_THRESHOLD=0.8
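
The sketch below shows how a review budget and an auto-label threshold could interact. The function triage is hypothetical. Its defaults mirror ACTIVE_LEARNING_BUDGET and AUTO_LABEL_CONFIDENCE_THRESHOLD above, while CANNOT_ANSWER_MIN_CONFIDENCE is left out because its exact semantics are not described here.

```python
from typing import Dict, List, Tuple


def triage(
    confidences: Dict[str, float],
    budget: int = 50,
    auto_label_threshold: float = 0.8,
) -> Tuple[List[str], List[str]]:
    """Return (auto_labeled, needs_review) session IDs."""
    # High-confidence sessions are labeled without human review.
    auto_labeled = sorted(
        s for s, c in confidences.items() if c >= auto_label_threshold
    )
    uncertain = [s for s, c in confidences.items() if c < auto_label_threshold]
    # Least confident first; the ID breaks ties so the order is deterministic.
    uncertain.sort(key=lambda s: (confidences[s], s))
    return auto_labeled, uncertain[:budget]
```

Reviewing the least confident sessions first is one common uncertainty-sampling heuristic; the project's own selection strategy may differ.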

Performance Optimization

Vector Reindexing

Automatic vector index reindexing for optimal query performance:

Monitors query performance metrics
Automatically triggers a reindex when degradation is detected
Zero-downtime reindexing
Configurable via REINDEX_ENABLED flag
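
A degradation check of this kind could be sketched as follows. The median-latency heuristic, the thresholds, and the function names are assumptions, not the project's implementation. REINDEX INDEX CONCURRENTLY is a PostgreSQL 12+ statement that rebuilds an index without blocking normal reads and writes, which is one way to get a zero-downtime reindex.

```python
import re
from statistics import median
from typing import Sequence


def should_reindex(
    baseline_ms: Sequence[float],
    recent_ms: Sequence[float],
    max_slowdown: float = 1.5,
    min_samples: int = 20,
) -> bool:
    """Flag degradation when recent median latency exceeds the baseline by a factor."""
    if len(baseline_ms) < min_samples or len(recent_ms) < min_samples:
        return False  # Too little data to judge.
    return median(recent_ms) > max_slowdown * median(baseline_ms)


def reindex_statement(index_name: str) -> str:
    """Build the SQL for an online rebuild of one index."""
    # Accept only plain identifiers, since the name is interpolated into SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", index_name):
        raise ValueError(f"invalid index name: {index_name!r}")
    return f"REINDEX INDEX CONCURRENTLY {index_name};"
```

Note that REINDEX ... CONCURRENTLY cannot run inside a transaction block, so it would need an autocommit connection.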

Caching

React Query caching on frontend
Session prefetching for faster navigation
Optimistic updates for better UX

Monitoring

Request ID tracking for debugging
Structured JSON logging (production)
Performance metrics in admin dashboard
Benchmark tracking for continuous monitoring

Configuration:

REINDEX_ENABLED=true
BENCHMARKING_ENABLED=true
LOG_JSON=true  # For production
LOG_LEVEL=INFO
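
For illustration, structured JSON logging with a request ID can be built from the standard library alone, as in this sketch. It reads LOG_JSON and LOG_LEVEL as configured above. The names request_id_var, JsonFormatter, and configure_logging, and the "app" logger name, are assumptions rather than the backend's actual code.

```python
import json
import logging
import os
from contextvars import ContextVar

# Set once per request (for example in middleware); "-" means no request context.
request_id_var: ContextVar[str] = ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
        }
        return json.dumps(payload)


def configure_logging() -> logging.Logger:
    """Configure the 'app' logger from LOG_LEVEL and LOG_JSON."""
    logger = logging.getLogger("app")
    try:
        logger.setLevel(os.environ.get("LOG_LEVEL", "INFO").upper())
    except ValueError:
        logger.setLevel(logging.INFO)  # Unknown level name: fall back to INFO.
    handler = logging.StreamHandler()
    if os.environ.get("LOG_JSON", "false").strip().lower() == "true":
        handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger
```

Using a ContextVar keeps the request ID correct across concurrent async requests, which matters in an async framework such as FastAPI.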

Contributing

Development Workflow

Fork the repository
Create a feature branch: git checkout -b feature/your-feature
Make your changes
Write/update tests
Run tests and format the code: docker compose exec backend pytest && black app/
Commit: git commit -m "Add your feature"
Push: git push origin feature/your-feature
Create a Pull Request

Code Style

Follow PEP 8 guidelines
Use type hints for all function parameters
Write docstrings for public functions
Keep functions focused and single-purpose
Add tests for new features

License

MIT License - See LICENSE file for details

Support

User Manual: USER_MANUAL.md - Complete testing guide
Dynamic Training:
- Quick Start - 5-minute guide
- Full Manual - Complete reference
Issues: Open an issue on GitHub
Questions: Use GitHub Discussions

Acknowledgments

FastAPI for the excellent web framework
PostgreSQL pgvector for vector similarity search
OpenAI for embeddings (text-embedding-3-small) and LLM APIs
Google for the Gemini LLM API
Sentence Transformers as an alternative embedding provider
