Cloudvelous Chat Assistant

An intelligent chatbot powered by advanced RAG (Retrieval-Augmented Generation) with agentic workflows, continuous learning, and production-ready quality assessment. Built with FastAPI, PostgreSQL (pgvector), React, and multi-LLM integration.

Features

Backend

  • RAG Architecture: Semantic search over documentation using pgvector vector database
  • Multi-LLM Support: OpenAI GPT-4o-mini and Google Gemini 1.5 Flash integration with runtime switching
  • Production Embeddings: OpenAI text-embedding-3-small (1536-dim) for high-quality semantic search
  • Query Refinement: LLM-powered intelligent query enhancement for better retrieval
  • Curated Q&A System: Priority layer for instant answers to common questions (80% similarity threshold)
  • Agentic Workflows: Multi-step reasoning with learned templates (search, validate, refine, synthesize)
  • Workflow Templates: Automatic learning and reuse of successful reasoning patterns
  • Contextual Learning: Query-specific chunk boosting to prevent cross-category contamination
  • Quality Assessment: Multi-signal quality scoring with auto-assessment
  • Active Learning: Smart sample selection for efficient human review
  • Tool Execution: Flexible tool calling system for extended capabilities
  • Gap Detection: Automatic identification of missing knowledge and capabilities
  • Vector Reindexing: Automatic index optimization for performance
  • Admin Training Interface: Review sessions, provide feedback, and analyze performance metrics
  • Dynamic Training Mode: Interactive chat-based refinement of answers, workflows, and knowledge chunks
  • Self-Improving: Automatically adjusts retrieval accuracy weights based on user feedback
  • GitHub Integration: Production-ready automated repository documentation ingestion
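
The curated Q&A priority layer listed above can be pictured as a similarity gate in front of the RAG pipeline. The sketch below is a minimal illustration, not the project's implementation: `cosine_similarity`, `answer_from_curated`, and the in-memory list of pairs are hypothetical, and only the 0.80 threshold comes from the feature list.

```python
import math
from typing import List, Optional, Sequence, Tuple

CURATED_THRESHOLD = 0.80  # from the feature list; everything else here is assumed


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def answer_from_curated(query_vec: Sequence[float],
                        curated: List[Tuple[Sequence[float], str]]) -> Optional[str]:
    """Return the best curated answer if it clears the threshold, else None.

    `curated` holds (question_embedding, answer) pairs. None means the caller
    should fall through to the normal retrieval pipeline.
    """
    best_score, best_answer = 0.0, None
    for vec, answer in curated:
        score = cosine_similarity(query_vec, vec)
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score >= CURATED_THRESHOLD else None
```

In a real deployment the comparison would presumably run inside PostgreSQL through pgvector rather than in Python; the in-memory loop only shows the gating logic.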

Frontend

  • Public Chat Interface: Beautiful React UI with syntax highlighting and source attribution
  • Admin Dashboard: Live statistics, LLM performance analytics, and system metrics
  • Session Management: Advanced filtering, search, sorting, pagination, and export (CSV/JSON)
  • Dynamic Training Interface: Interactive chat-based training with gap detection and auto-implementation
  • Training System: Quick review mode, bulk operations, keyboard shortcuts, undo functionality
  • Workflow Visualization: Step-by-step workflow execution trace with chunk retrieval details
  • Template Management: Browse, search, and manage workflow templates
  • Inspector Tools: Deep session analysis and comparison capabilities
  • Secure Authentication: API key authentication with protected admin routes
  • Real-time Updates: React Query for efficient data fetching and optimistic updates
  • Responsive Design: Mobile-friendly interface with Tailwind CSS
  • Accessibility: ARIA labels, keyboard navigation, and screen reader support

Architecture

Tech Stack

Backend:

  • Framework: FastAPI (Python 3.11+) with async support
  • Database: PostgreSQL 16 with pgvector extension for vector similarity search
  • Embeddings: OpenAI text-embedding-3-small (1536-dim) for production use
    • Note: Sentence Transformers (384-dim) supported but requires database migration
  • LLM Providers: OpenAI GPT-4o-mini (default), Google Gemini 1.5 Flash
  • Deployment: Docker Compose for local development and production

Frontend:

  • Framework: React 19 with TypeScript
  • Build Tool: Vite 7 for fast development and optimized builds
  • Styling: Tailwind CSS 3 for responsive design
  • State Management: TanStack React Query 5 (server state) + Zustand 5 (client state)
  • Routing: React Router 7 with protected routes
  • Validation: Zod 3.25 for runtime type safety
  • HTTP Client: Axios 1.13 with interceptors
  • Date Handling: date-fns 4.1 for date formatting
  • Code Display: React Syntax Highlighter 16.1 for code blocks
  • Testing: Vitest 4.0 + React Testing Library 16.3

Key Components

  1. Embedding Service: Converts text to vector embeddings for semantic search
  2. Retrieval Service: Finds relevant knowledge chunks using vector similarity
  3. Generator Service: Generates responses using retrieved context and LLM
  4. Workflow Learner: Learns from successful query patterns to boost future retrievals
  5. Training System: Collects feedback and adjusts accuracy weights
  6. Quality Assessor: Multi-signal quality scoring and auto-assessment
  7. Active Learner: Smart sample selection for human review
  8. Query Refiner: LLM-powered query enhancement
  9. Template Manager: Workflow template CRUD and scoring
  10. Action Executor: Multi-step workflow execution engine
  11. Inspector Service: Session analysis and comparison tools
  12. Dynamic Training Orchestrator: Interactive training chat coordination

Quick Start

Prerequisites

  • Docker and Docker Compose
  • Git
  • At least 8GB RAM available for Docker (OpenAI embeddings + LLM operations)
  • API keys for OpenAI and/or Google Gemini

Installation

  1. Clone the repository

    git clone <repository-url>
    cd cloudvelous-chatbot
  2. Configure environment variables

    cp .env.example .env

    Edit .env and set required values:

    # Database
    POSTGRES_PASSWORD=your_secure_password
    
    # LLM Provider (choose one or both)
    OPENAI_API_KEY=sk-your-openai-key
    GEMINI_API_KEY=your-gemini-key
    LLM_PROVIDER=openai  # or gemini
    
    # Embedding Configuration
    # Default provider is "openai" (1536-dim, text-embedding-3-small).
    # Sentence-transformers (384-dim) is supported but requires a database migration.
    # EMBED_PROVIDER=openai
    
    # Feature Flags (all default to true in config.py; override here if needed)
    # CURATED_QA_ENABLED=true
    # AGENTIC_WORKFLOW_ENABLED=true
    # CONTEXTUAL_BOOST_ENABLED=true
    # AUTO_QUALITY_ASSESSMENT_ENABLED=true
    # ACTIVE_LEARNING_ENABLED=true
    # BENCHMARKING_ENABLED=true
    # REINDEX_ENABLED=true
    # DEBUG_WORKFLOW=false
    # REFINE_ENABLED=true
    
    # Agentic V1 Smart Review (Phase A1)
    AGENTIC_V1_SMART_REVIEW=true
    
    # GitHub Integration
    GITHUB_TOKEN=ghp_your-github-token
    
    # Security (generate with: openssl rand -hex 32)
    ADMIN_JWT_SECRET=your-jwt-secret-min-32-chars
    ADMIN_API_KEY=your-api-key

    How to Generate a GitHub Personal Access Token

    Step-by-Step Instructions:

    1. Go to GitHub Settings

      • Navigate to https://github.com/settings/tokens
      • Or: Click your profile picture → Settings → Developer settings (left sidebar) → Personal access tokens → Tokens (classic)
    2. Generate New Token

      • Click "Generate new token" → "Generate new token (classic)"
    3. Configure Token

      • Note/Name: Give it a descriptive name like Cloudvelous Chatbot - Local Dev
      • Expiration: Choose an expiration period (recommend 90 days for development)
    4. Select Scopes/Permissions

      For this project, you'll need:

      • repo (Full control of private repositories) - if accessing private repos
      • public_repo (Access public repositories) - if only accessing public repos
      • read:org (Read org and team membership) - if working with organization repos

      Minimum required: Just public_repo if you're only ingesting public documentation

    5. Generate and Copy

      • Click "Generate token" at the bottom
      • ⚠️ IMPORTANT: Copy the token immediately! You won't be able to see it again
      • The token will start with ghp_
    6. Add Token to Your Project

      • Add the token to your .env file as GITHUB_TOKEN=ghp_your_token_here
  3. Start all services

    docker compose up -d
  4. Verify services are running

    docker compose ps

    Expected output:

    NAME                            STATUS          PORTS
    cloudvelous-chatbot-db          Up (healthy)    0.0.0.0:5432->5432/tcp
    cloudvelous-chatbot-backend     Up              0.0.0.0:8000->8000/tcp
    cloudvelous-chatbot-frontend    Up              0.0.0.0:5173->5173/tcp
    cloudvelous-chatbot-postadmin   Up              0.0.0.0:5050->80/tcp
    
  5. Check API health

    curl http://localhost:8000/health

    Expected response:

    {"status":"healthy","version":"0.4.0","phase":"3"}
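  6. Access the API documentation

    Visit http://localhost:8000/docs for the interactive API documentation.

  7. Access the Frontend (Included in Docker Compose)

    The frontend starts automatically with docker compose up. Access:

    • Public chat: http://localhost:5173/
    • Admin login: http://localhost:5173/admin/login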

    Admin Login:

    • Use the ADMIN_API_KEY from your .env file

    Development Mode (separate from Docker):

    cd frontend
    npm install
    npm run dev
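
Before starting the services, it can help to sanity-check the values set in .env during step 2. The following is a minimal sketch under stated assumptions: `parse_env` and `check_env` are hypothetical helpers, not part of the repository, and the rules they apply (three required keys, a 32-character minimum for the JWT secret, an API key matching LLM_PROVIDER) are read off the example .env above.

```python
from typing import Dict, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and full-line comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop trailing inline comments such as "openai  # or gemini".
        values[key.strip()] = value.split(" #", 1)[0].strip()
    return values


def check_env(values: Dict[str, str]) -> List[str]:
    """Return a list of readable problems; an empty list means the basics look fine."""
    problems = []
    for key in ("POSTGRES_PASSWORD", "ADMIN_JWT_SECRET", "ADMIN_API_KEY"):
        if not values.get(key):
            problems.append(f"{key} is not set")
    if 0 < len(values.get("ADMIN_JWT_SECRET", "")) < 32:
        problems.append("ADMIN_JWT_SECRET is shorter than 32 characters")
    provider = values.get("LLM_PROVIDER", "openai")
    key_for = {"openai": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}
    needed = key_for.get(provider)
    if needed is None:
        problems.append(f"LLM_PROVIDER has unexpected value {provider!r}")
    elif not values.get(needed):
        problems.append(f"{needed} is required when LLM_PROVIDER={provider}")
    return problems
```

A typical call would be `check_env(parse_env(open(".env").read()))`. The inline-comment handling is deliberately naive: a value that itself contains " #" would be cut short.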

Usage

Option 1: Web Interface (Recommended)

The easiest way to use Cloudvelous is through the web interface. See Frontend Web Interface in the User Manual for detailed instructions.

  1. For End Users (Public Chat):

    • Navigate to http://localhost:5173/
    • Type your questions in the chat input
    • View responses with source attribution
    • No authentication required
  2. For Admins (Sessions & Training):

    • Navigate to http://localhost:5173/admin/login
    • Enter your admin API key (from .env)
    • Access session management and training interface
    • Manage, search, and export chat sessions

Option 2: API Endpoints (Direct)

Ask a Question

curl -X POST http://localhost:8000/api/ask \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How do I implement authentication?",
    "llm_provider": "openai",
    "llm_model": "gpt-4o-mini"
  }'
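
The same request can be issued from Python. This is a standard-library sketch that mirrors the curl call above; `build_ask_request` and `ask` are hypothetical helper names, and the response is returned as a plain dict because its schema is not described here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # matches the Docker Compose port mapping


def build_ask_request(query: str, provider: str = "openai",
                      model: str = "gpt-4o-mini") -> urllib.request.Request:
    """Build (but do not send) the POST /api/ask request."""
    body = json.dumps({
        "query": query,
        "llm_provider": provider,
        "llm_model": model,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str) -> dict:
    """Send the request and decode the JSON response (shape not assumed here)."""
    with urllib.request.urlopen(build_ask_request(query), timeout=60) as resp:
        return json.load(resp)
```

Separating request construction from sending keeps the payload easy to inspect or unit-test without a running backend.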

Submit Feedback

curl -X POST http://localhost:8000/api/train \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": 123,
    "is_correct": true,
    "chunk_feedback": [
      {"chunk_id": 1, "was_useful": true}
    ]
  }'

View Admin Sessions

curl -X POST http://localhost:8000/api/admin/sessions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-admin-api-key" \
  -d '{
    "skip": 0,
    "limit": 10,
    "feedback_status": "pending"
  }'
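
The skip and limit fields suggest offset-based pagination. A sketch of walking through all pages follows; `fetch_page` is a hypothetical callable that POSTs the payload to /api/admin/sessions and returns the list of sessions, since the actual response envelope is not shown here.

```python
from typing import Callable, Iterator, List


def iter_sessions(fetch_page: Callable[[dict], List[dict]],
                  limit: int = 10,
                  feedback_status: str = "pending") -> Iterator[dict]:
    """Yield sessions page by page until a short (or empty) page is returned."""
    skip = 0
    while True:
        page = fetch_page({"skip": skip, "limit": limit,
                           "feedback_status": feedback_status})
        yield from page
        if len(page) < limit:
            return  # fewer rows than requested: this was the last page
        skip += limit
```

If the endpoint reports a total count, stopping on that count would save the final request when the number of sessions is an exact multiple of the limit.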

Get Performance Metrics

curl -X GET http://localhost:8000/api/admin/stats \
  -H "X-API-Key: your-admin-api-key"

Dynamic Training

# Start dynamic training session
curl -X POST http://localhost:8000/api/admin/training/dynamic/session \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-admin-api-key" \
  -d '{
    "mode": "edit",
    "session_id": 123
  }'

# Send chat message in training
curl -X POST http://localhost:8000/api/admin/training/dynamic/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-admin-api-key" \
  -d '{
    "training_session_id": "uuid",
    "message": "Add more information about authentication"
  }'

Quality Assessment

# Get quality signals for a session
curl -X GET http://localhost:8000/api/quality/signals/{session_id} \
  -H "X-API-Key: your-admin-api-key"

Workflow Templates

# List workflow templates
curl -X POST http://localhost:8000/api/admin/templates/list \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-admin-api-key" \
  -d '{"skip": 0, "limit": 20}'

# Get template details
curl -X GET http://localhost:8000/api/admin/templates/{template_id} \
  -H "X-API-Key: your-admin-api-key"

Complete API Documentation

Visit http://localhost:8000/docs for interactive API documentation with:

  • All available endpoints
  • Request/response schemas
  • Try-it-out functionality
  • Authentication requirements

Project Structure

cloudvelous-chatbot/
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI application entry point
│   │   ├── config.py            # Configuration management (pydantic-settings)
│   │   ├── database.py          # Database connection & session
│   │   ├── exceptions.py        # Custom exception classes
│   │   ├── routers/             # API route handlers
│   │   │   ├── chat.py          # /api/ask endpoint
│   │   │   ├── training.py      # /api/train endpoint
│   │   │   ├── admin.py         # /api/admin/* endpoints
│   │   │   ├── dynamic_training.py  # /api/admin/training/dynamic/*
│   │   │   ├── workflows.py     # /api/workflows/* endpoints
│   │   │   ├── quality.py       # /api/quality/* endpoints
│   │   │   ├── admin_templates.py   # /api/admin/templates/*
│   │   │   ├── inspector.py     # /api/embedding-inspector/*
│   │   │   ├── iterative_sessions.py  # /api/sessions/iterative/*
│   │   │   ├── workflow_candidates.py # /api/workflows/candidates/*
│   │   │   ├── offline_training.py    # /api/workflows/offline/*
│   │   │   ├── v2_metrics.py    # /api/v2-metrics/* endpoints
│   │   │   ├── phase_a_admin.py # /api/admin/phase-a/*
│   │   │   └── phase_b_admin.py # /api/admin/phase-b/*
│   │   ├── services/            # Business logic layer
│   │   │   ├── embedder.py      # Text embedding generation
│   │   │   ├── retriever.py     # Semantic search & ranking
│   │   │   ├── generator.py     # LLM response generation
│   │   │   ├── workflow_learner.py      # Pattern learning
│   │   │   ├── workflow_planner.py      # Multi-step planning
│   │   │   ├── workflow_regenerator.py  # Dynamic workflow regeneration
│   │   │   ├── workflow_adapter.py      # Workflow adaptation
│   │   │   ├── workflow_distiller.py    # Workflow distillation
│   │   │   ├── action_executor.py       # Workflow execution
│   │   │   ├── template_learner.py      # Template management
│   │   │   ├── workflow_tracer.py       # Reasoning capture
│   │   │   ├── workflow_serializer.py   # Workflow serialization
│   │   │   ├── session_orchestrator.py  # Session orchestration
│   │   │   ├── contextual_boost.py      # Query-specific learning
│   │   │   ├── quality_metrics.py       # Quality calculation
│   │   │   ├── auto_quality_assessor.py # Auto-quality scoring
│   │   │   ├── active_learner.py        # Smart sample selection
│   │   │   ├── benchmark_tracker.py     # Performance tracking
│   │   │   ├── feedback_analyzer.py     # Feedback analysis
│   │   │   ├── missing_capability_detector.py  # Gap detection
│   │   │   ├── cannot_answer_detector.py  # Cannot-answer detection
│   │   │   ├── reindex_service.py       # Vector reindexing
│   │   │   ├── dynamic_training_service.py    # Interactive training
│   │   │   ├── training_chat_orchestrator.py  # Chat coordination
│   │   │   ├── inspector_service.py     # Session analysis
│   │   │   ├── template_admin_service.py  # Template management
│   │   │   ├── admin_service.py         # Admin operations
│   │   │   ├── v2_metrics.py            # V2 metrics service
│   │   │   ├── shadow_mode.py           # Shadow mode (V1/V2 comparison)
│   │   │   ├── phase_a_metrics.py       # Phase A observability
│   │   │   ├── phase_b_regenerator_eval.py  # Phase B evaluation
│   │   │   └── debug_workflow_logger.py # Workflow debug logging
│   │   ├── models/              # SQLAlchemy ORM models
│   │   │   ├── embeddings.py    # KnowledgeChunk model
│   │   │   ├── training_sessions.py     # TrainingSession model
│   │   │   ├── embedding_links.py       # Chunk-session links
│   │   │   ├── feedback.py              # TrainingFeedback model
│   │   │   ├── curated_qa.py            # CuratedQA model
│   │   │   ├── qa_suggestions.py        # QASuggestion model
│   │   │   ├── workflow_templates.py    # WorkflowTemplate model
│   │   │   ├── workflow_actions.py      # WorkflowAction model
│   │   │   ├── workflow_executions.py   # WorkflowExecution model
│   │   │   ├── workflow_execution_steps.py  # Step results
│   │   │   ├── workflow_step_chunks.py  # Chunks per step
│   │   │   ├── workflow_categories.py   # Taxonomy
│   │   │   ├── workflow_vectors.py      # Workflow embeddings
│   │   │   ├── workflow_candidates.py   # Workflow candidate tracking
│   │   │   ├── required_tools.py        # Tool tracking
│   │   │   ├── improvement_tasks.py     # Action items
│   │   │   ├── answer_quality_signals.py    # Quality signals
│   │   │   ├── active_learning_queue.py     # Learning queue
│   │   │   ├── benchmark_results.py         # Performance tracking
│   │   │   ├── learning_experiments.py      # A/B testing
│   │   │   ├── dynamic_training_sessions.py # Training sessions
│   │   │   ├── reindex_state.py         # Reindex state tracking
│   │   │   ├── conversation_context.py  # Conversation context
│   │   │   ├── shadow_comparison.py     # Shadow mode comparisons
│   │   │   ├── iterative_sessions.py    # Iterative sessions
│   │   │   ├── session_feedback.py      # Session feedback items
│   │   │   ├── offline_experiments.py   # Offline experiment data
│   │   │   ├── phase_a_metrics.py       # Phase A observability models
│   │   │   └── phase_b_regenerator_metrics.py  # Phase B eval models
│   │   ├── schemas/             # Pydantic request/response schemas
│   │   ├── llm/                 # LLM provider implementations
│   │   │   ├── base.py          # ILLMProvider interface
│   │   │   ├── factory.py       # Provider factory pattern
│   │   │   ├── openai_provider.py   # OpenAI integration (gpt-4o-mini)
│   │   │   └── gemini_provider.py   # Google Gemini integration (1.5-flash)
│   │   ├── training/            # Training & optimization
│   │   │   └── optimizer.py     # Accuracy weight optimizer
│   │   ├── middleware/          # Authentication & middleware
│   │   │   └── auth.py          # JWT & API key authentication
│   │   └── utils/               # Logging, rate limiting, utilities
│   ├── tests/                   # Unit and integration tests
│   │   ├── unit/                # Unit tests
│   │   ├── integration/         # Integration tests
│   │   └── ingestion/           # Ingestion pipeline tests
│   ├── alembic/                 # Database migrations
│   │   └── versions/            # Migration scripts
│   ├── requirements.txt         # Python dependencies
│   ├── pytest.ini               # Test configuration
│   └── Dockerfile               # Backend container image
├── scripts/
│   ├── init-db.sql              # Database initialization SQL
│   ├── initial_ingestion.py     # Data ingestion script
│   ├── manual_retrain.py        # Manual retraining trigger
│   └── ingestion/               # GitHub ingestion pipeline
│       ├── github_client.py     # GitHub API client
│       ├── file_fetcher.py      # Repository file fetcher
│       ├── text_chunker.py      # Semantic text chunking
│       ├── embedding_processor.py  # Embedding generation
│       └── db_writer.py         # Database storage (PostgreSQL/pgvector)
├── frontend/
│   ├── src/
│   │   ├── api/                 # API service layer
│   │   │   ├── client.ts        # Axios client configuration
│   │   │   ├── queryClient.ts   # TanStack React Query client
│   │   │   ├── admin.service.ts     # Admin API calls
│   │   │   ├── chat.service.ts      # Chat API calls
│   │   │   ├── training.service.ts  # Training API calls
│   │   │   ├── workflow.service.ts  # Workflow API calls
│   │   │   ├── health.service.ts    # Health check API calls
│   │   │   └── dynamic-training.service.ts  # Dynamic training API
│   │   ├── components/
│   │   │   ├── admin/
│   │   │   │   ├── DynamicTrainingChat.tsx
│   │   │   │   ├── DynamicTrainingQuickGuide.tsx
│   │   │   │   ├── WorkflowTrace.tsx
│   │   │   │   ├── WorkflowGapIndicator.tsx
│   │   │   │   ├── SessionInspectorModal.tsx
│   │   │   │   ├── SessionTable.tsx
│   │   │   │   ├── SessionRowActions.tsx
│   │   │   │   ├── QuickReviewCard.tsx
│   │   │   │   ├── FeedbackForm.tsx
│   │   │   │   ├── ChunkList.tsx
│   │   │   │   ├── ChunkCard.tsx
│   │   │   │   ├── ChunkSelectorModal.tsx
│   │   │   │   ├── ImprovementSuggestionsPanel.tsx
│   │   │   │   ├── CustomSuggestionModal.tsx
│   │   │   │   ├── Pagination.tsx
│   │   │   │   └── KeyboardShortcutsHelp.tsx
│   │   │   ├── chat/
│   │   │   │   ├── ChatInterface.tsx
│   │   │   │   ├── MessageList.tsx
│   │   │   │   ├── Message.tsx
│   │   │   │   └── MessageInput.tsx
│   │   │   ├── auth/
│   │   │   │   └── ProtectedRoute.tsx
│   │   │   └── ui/              # Reusable UI components
│   │   │       ├── Dialog.tsx
│   │   │       ├── Modal.tsx
│   │   │       ├── LoadingSpinner.tsx
│   │   │       └── EmptyState.tsx
│   │   ├── pages/               # Page components
│   │   │   ├── AdminLoginPage.tsx
│   │   │   ├── SessionsPage.tsx
│   │   │   ├── TrainingPage.tsx
│   │   │   └── NotFoundPage.tsx
│   │   ├── hooks/               # Custom React hooks
│   │   │   ├── useChat.ts
│   │   │   ├── useSession.ts
│   │   │   ├── useInspector.ts
│   │   │   ├── useSuggestions.ts
│   │   │   └── useKeyboardShortcuts.ts
│   │   ├── store/               # Zustand stores
│   │   │   └── chatStore.ts
│   │   ├── contexts/            # React contexts
│   │   │   └── AuthContext.tsx   # Auth context with useAuth hook
│   │   ├── config/              # Configuration
│   │   │   ├── api.config.ts
│   │   │   └── app.config.ts
│   │   ├── types/               # TypeScript types
│   │   ├── utils/               # Utility functions
│   │   ├── App.tsx              # Root component with routing
│   │   └── main.tsx             # Entry point
│   ├── public/                  # Static assets
│   ├── Dockerfile               # Frontend container image
│   ├── vite.config.ts           # Vite configuration
│   ├── tailwind.config.js       # Tailwind configuration
│   └── package.json             # Dependencies
├── docker-compose.yml           # Docker Compose (4 services)
├── .env.example                 # Environment variable template
├── USER_MANUAL.md               # Complete user guide with testing
├── DYNAMIC_TRAINING_MANUAL.md   # Dynamic training reference
├── QUICK_START_DYNAMIC_TRAINING.md  # 5-minute guide
└── README.md                    # This file

Development

Local Development Setup

# Install backend dependencies locally (optional)
cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

# Run locally without Docker
python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Managing Services

# Start all services
docker compose up -d

# Start specific service
docker compose up -d backend

# Stop all services
docker compose down

# Stop without removing volumes
docker compose stop

# Restart service
docker compose restart backend

# Rebuild after code changes
docker compose up -d --build backend

# View logs
docker compose logs -f backend

# View resource usage
docker compose stats

# Clean up everything (including volumes)
docker compose down -v

Database Operations

# Access PostgreSQL shell
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot

-- Common commands inside the psql shell (psql comments use --, not #)
-- List tables
\dt
-- Describe a table
\d tablename
-- Count chunks
SELECT COUNT(*) FROM knowledge_chunks;

# Run migrations
docker compose exec backend alembic upgrade head

# Create new migration
docker compose exec backend alembic revision --autogenerate -m "Add new column"

# Rollback last migration
docker compose exec backend alembic downgrade -1

# View migration history
docker compose exec backend alembic history

Testing

The project has comprehensive test coverage across unit, integration, and E2E test levels.

Test Categories:

  • Unit tests: Test individual components in isolation (default, always run)
  • Integration tests: Test component interactions with mocked dependencies (requires -m integration)
  • E2E tests: Test complete workflows with real PostgreSQL database (requires -m e2e and environment setup)

# Run unit tests only (default)
docker compose exec backend pytest

# Run all tests including integration tests
docker compose exec backend pytest -m "unit or integration"

# Run specific test categories
docker compose exec backend pytest -m unit          # Unit tests only
docker compose exec backend pytest -m integration   # Integration tests only
docker compose exec backend pytest -m e2e          # E2E tests only (requires setup)

# Run tests for specific features
docker compose exec backend pytest tests/unit/test_embedder.py
docker compose exec backend pytest tests/integration/test_training_router.py
docker compose exec backend pytest tests/ingestion/ -v

# Run with coverage report
docker compose exec backend pytest --cov=app --cov-report=html
open htmlcov/index.html  # View coverage report

# Run tests matching pattern
docker compose exec backend pytest -k "test_retrieval"

# Run with verbose output
docker compose exec backend pytest -v

E2E Testing (Real Database):

E2E tests verify actual database behavior with PostgreSQL + pgvector:

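# Set up E2E test environment
export TEST_DATABASE_URL="postgresql://user:pass@localhost:5432/testdb"
export RUN_E2E_TESTS=true

# Run E2E tests for training feedback
# (docker compose exec does not inherit host variables, so pass them with -e)
docker compose exec -e TEST_DATABASE_URL="$TEST_DATABASE_URL" -e RUN_E2E_TESTS="$RUN_E2E_TESTS" backend pytest tests/integration/test_training_e2e.py -m e2e -v

# Run specific E2E test
docker compose exec -e TEST_DATABASE_URL="$TEST_DATABASE_URL" -e RUN_E2E_TESTS="$RUN_E2E_TESTS" backend pytest tests/integration/test_training_e2e.py::TestTrainingFeedbackE2E::test_submit_feedback_updates_was_useful_flags_in_real_db -xvs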

Test Coverage:

  • Unit tests: 150+ tests
  • Integration tests: 50+ tests
  • E2E tests: 20+ tests
  • Frontend tests: Component and hook tests
  • Total: 220+ tests with comprehensive coverage

Code Quality

# Format code with Black
docker compose exec backend black app/

# Lint with Flake8
docker compose exec backend flake8 app/

# Type checking with MyPy
docker compose exec backend mypy app/

# Run all quality checks
docker compose exec backend sh -c "black app/ && flake8 app/ && mypy app/"

Configuration

All configuration is managed through environment variables defined in .env.

Core Settings

| Variable | Description | Default | Required |
|---|---|---|---|
| POSTGRES_PASSWORD | PostgreSQL password | - | Yes |
| DATABASE_URL | Full database connection string | Auto-generated | No |
| OPENAI_API_KEY | OpenAI API key | - | If using OpenAI |
| GEMINI_API_KEY | Gemini API key | - | If using Gemini |
| GITHUB_TOKEN | GitHub API access token | - | For ingestion |
| LLM_PROVIDER | Default LLM provider | openai | No |

Embedding Settings

| Variable | Description | Default | Required |
|---|---|---|---|
| EMBED_PROVIDER | Embedding provider | openai | Yes |
| OPENAI_EMBEDDING_MODEL | OpenAI model name | text-embedding-3-small | If using OpenAI |
| OPENAI_EMBEDDING_DIMENSIONS | Embedding dimensions | 1536 | If using OpenAI |
| SENTENCE_TRANSFORMERS_MODEL | Sentence transformers model | all-MiniLM-L6-v2 | If using sentence-transformers |
| SENTENCE_TRANSFORMERS_DIMENSIONS | Embedding dimensions | 384 | If using sentence-transformers |

Feature Flags

| Variable | Description | Default |
|---|---|---|
| CURATED_QA_ENABLED | Enable curated Q&A priority layer | true |
| AGENTIC_WORKFLOW_ENABLED | Enable multi-step workflows | true |
| CONTEXTUAL_BOOST_ENABLED | Enable query-specific learning | true |
| AUTO_QUALITY_ASSESSMENT_ENABLED | Enable auto-quality scoring | true |
| ACTIVE_LEARNING_ENABLED | Enable smart sample selection | true |
| REFINE_ENABLED | Enable LLM-based query refinement | true |
| BENCHMARKING_ENABLED | Enable performance tracking | true |
| REINDEX_ENABLED | Enable automatic reindexing | true |
| DEBUG_WORKFLOW | Enable detailed workflow tracing | false |

Retrieval Settings

| Variable | Description | Default |
|---|---|---|
| TOP_K_RETRIEVAL | Number of chunks to retrieve | 5 |
| WORKFLOW_BOOST_FACTOR | Boost for workflow matches | 1.2 |
| CHUNK_WEIGHT_ADJUSTMENT_RATE | Learning rate for weights | 0.1 |
| MIN_CHUNK_WEIGHT | Minimum accuracy weight | 0.5 |
| MAX_CHUNK_WEIGHT | Maximum accuracy weight | 2.0 |
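
One way these retrieval settings could interact is sketched below. The additive update rule and the similarity-times-weight ranking are assumptions made for illustration; the README only states the rate, the bounds, and the top-k value, and `adjust_weight` and `rank_chunks` are hypothetical names.

```python
from typing import List, Tuple

ADJUSTMENT_RATE = 0.1   # CHUNK_WEIGHT_ADJUSTMENT_RATE
MIN_WEIGHT = 0.5        # MIN_CHUNK_WEIGHT
MAX_WEIGHT = 2.0        # MAX_CHUNK_WEIGHT
TOP_K = 5               # TOP_K_RETRIEVAL


def adjust_weight(weight: float, was_useful: bool) -> float:
    """Nudge a chunk's accuracy weight up or down, clamped to the allowed range."""
    delta = ADJUSTMENT_RATE if was_useful else -ADJUSTMENT_RATE
    return max(MIN_WEIGHT, min(MAX_WEIGHT, weight + delta))


def rank_chunks(scored: List[Tuple[int, float, float]],
                top_k: int = TOP_K) -> List[int]:
    """Rank (chunk_id, similarity, weight) triples by similarity * weight."""
    ordered = sorted(scored, key=lambda c: c[1] * c[2], reverse=True)
    return [chunk_id for chunk_id, _, _ in ordered[:top_k]]
```

The clamp keeps repeated feedback from driving a weight to zero or letting a single chunk dominate every query.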

Workflow Learning

| Variable | Description | Default |
|---|---|---|
| WORKFLOW_EMBEDDING_ENABLED | Enable workflow learning | true |
| WORKFLOW_SIMILARITY_WEIGHT | Weight for workflow similarity | 0.3 |
| FEEDBACK_THRESHOLD_FOR_RETRAIN | Feedback count trigger | 50 |
| MIN_WORKFLOW_CLUSTER_SIZE | Minimum cluster size | 3 |

Quality Assessment Settings

| Variable | Description | Default |
|---|---|---|
| CANNOT_ANSWER_MIN_CONFIDENCE | Minimum confidence for quality | 0.5 |
| AUTO_LABEL_CONFIDENCE_THRESHOLD | Threshold for auto-labeling | 0.8 |
| ACTIVE_LEARNING_BUDGET | Reviews per day | 50 |
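
The auto-labeling threshold and the daily review budget can be read as a simple routing rule. The sketch below is one plausible interpretation, not the behavior of auto_quality_assessor.py or active_learner.py: `route_sessions` is a hypothetical function, and the least-confident-first ordering is an assumption.

```python
from typing import Dict, List, Tuple

AUTO_LABEL_THRESHOLD = 0.8   # AUTO_LABEL_CONFIDENCE_THRESHOLD
REVIEW_BUDGET = 50           # ACTIVE_LEARNING_BUDGET (reviews per day)


def route_sessions(confidences: Dict[int, float],
                   budget: int = REVIEW_BUDGET) -> Tuple[List[int], List[int]]:
    """Split session ids into (auto_labeled, review_queue).

    Sessions at or above the threshold are auto-labeled; the rest are queued
    for human review, least confident first, up to the daily budget.
    """
    auto = sorted(sid for sid, c in confidences.items()
                  if c >= AUTO_LABEL_THRESHOLD)
    uncertain = sorted(
        (sid for sid, c in confidences.items() if c < AUTO_LABEL_THRESHOLD),
        key=lambda sid: confidences[sid],
    )
    return auto, uncertain[:budget]
```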

Security Settings

| Variable | Description | Required |
|---|---|---|
| ADMIN_JWT_SECRET | JWT signing secret (min 32 chars) | Yes |
| ADMIN_API_KEY | Admin API authentication key | Yes |
| JWT_ALGORITHM | JWT signing algorithm | No (default: HS256) |
| JWT_EXPIRATION_HOURS | Token expiry time (hours) | No (default: 24) |
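
To illustrate what the HS256 algorithm and a 24-hour expiry imply, here is a standard-library sketch of signing and verifying a token. It is not the code in middleware/auth.py, which may well rely on a JWT library; `sign_token` and `verify_token` are hypothetical names, and only the "sub" and "exp" claims are modeled.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_token(subject: str, secret: str, hours: int = 24,
               now: Optional[float] = None) -> str:
    """Create an HS256-signed JWT that expires after `hours` hours."""
    if len(secret) < 32:
        raise ValueError("secret must be at least 32 characters")
    issued = int(time.time() if now is None else now)
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(
        json.dumps({"sub": subject, "exp": issued + hours * 3600}).encode())
    signature = hmac.new(secret.encode(), f"{header}.{payload}".encode(),
                         hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def verify_token(token: str, secret: str,
                 now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and the token is unexpired."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return None  # not three dot-separated parts
    expected = hmac.new(secret.encode(), f"{header}.{payload}".encode(),
                        hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(_b64url(expected).encode(), signature.encode()):
        return None
    padded = payload + "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    current = time.time() if now is None else now
    return claims if claims.get("exp", 0) > current else None
```

The 32-character check mirrors the minimum stated for ADMIN_JWT_SECRET; a short secret makes HS256 signatures easier to brute-force.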

Monitoring & Debugging

Viewing Logs

# Follow all logs
docker compose logs -f

# Follow specific service
docker compose logs -f backend

# Last 100 lines
docker compose logs --tail=100 backend

# Export logs to file
docker compose logs > logs.txt

# Search logs for errors
docker compose logs backend | grep -i error

Health Checks

# Backend health
curl http://localhost:8000/health

# Database connection
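docker compose exec backend python -c "
from sqlalchemy import text
# database.py sits directly under app/ in the project structure
from app.database import SessionLocal
db = SessionLocal()
# SQLAlchemy 2.x requires raw SQL strings to be wrapped in text()
print('Connected:', db.execute(text('SELECT 1')).scalar())
db.close()
"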

# Check embedder
docker compose exec backend python -c "
from app.services.embedder import EmbeddingService
embedder = EmbeddingService()
print('Provider:', embedder.provider, '| Dimensions:', embedder.dimension)
"

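When scripting around startup, the backend health check can be polled until the service is ready. A minimal sketch follows; `is_healthy` and `wait_until_healthy` are hypothetical helpers, and the only assumption about the response is the "status": "healthy" field shown in the Quick Start.

```python
import json
import time
from typing import Callable


def is_healthy(body: str) -> bool:
    """True if the /health response body reports status "healthy"."""
    try:
        return json.loads(body).get("status") == "healthy"
    except (ValueError, AttributeError):
        return False  # not JSON, or JSON that is not an object


def wait_until_healthy(fetch: Callable[[], str], attempts: int = 10,
                       delay: float = 2.0) -> bool:
    """Poll `fetch` until it returns a healthy body or attempts run out."""
    for attempt in range(attempts):
        try:
            if is_healthy(fetch()):
                return True
        except OSError:
            pass  # e.g. connection refused while the backend is still starting
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A suitable `fetch` could be `lambda: urllib.request.urlopen("http://localhost:8000/health", timeout=5).read().decode()`; connection failures raised by urllib are subclasses of OSError, so they count as "not ready yet".
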
Performance Monitoring

# Container resource usage
docker compose stats

# Database stats
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot -c "
SELECT
  schemaname,
  tablename,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
"

# API endpoint stats (requires admin API key)
curl -H "X-API-Key: your-admin-api-key" \
  http://localhost:8000/api/admin/stats

Troubleshooting

Common Issues

Backend won't start - "ModuleNotFoundError"

# Rebuild container
docker compose up -d --build backend

Database connection refused

# Check PostgreSQL is running
docker compose ps postgres

# Restart PostgreSQL
docker compose restart postgres

# Wait a few seconds and restart backend
sleep 5
docker compose restart backend

Port already in use

# Find what's using the port
lsof -i :8000

# Kill the process or change port in docker-compose.yml

Slow query responses

# Check database indices
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot -c "\di"

# Check embedding count
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot -c \
  "SELECT COUNT(*) FROM knowledge_chunks;"

For more detailed troubleshooting, check the Docker logs and health endpoints described above.

GitHub Integration

The system includes a production-ready GitHub repository ingestion pipeline for automated documentation processing with comprehensive testing.

Status

The pipeline currently provides:

  • ✅ Complete end-to-end ingestion workflow
  • ✅ OpenAI embeddings (1536-dim) in production use
  • ✅ Transaction management with rollback support
  • ✅ Comprehensive test coverage (220+ tests)
  • ✅ CLI interface for repository ingestion
  • ✅ Batch processing and error handling

Features

  • Repository file fetching with type filtering (markdown, code, docs)
  • Recursive directory traversal with skip patterns
  • Smart text chunking by markdown headers and paragraphs
  • Vector embedding generation with OpenAI text-embedding-3-small (1536-dim)
  • Database storage with PostgreSQL and pgvector extension
  • Transaction management with rollback on failure
  • Repository-level delete and re-insert strategy
  • Full orchestration pipeline with CLI (scripts/initial_ingestion.py)
  • Progress tracking and summary reporting
  • Error handling and partial failure recovery
  • End-to-end integration testing with real GitHub repositories
  • Support for batch processing and lazy model loading
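
Chunking by markdown headers and paragraphs, as listed above, might look roughly like the following. This is a simplified sketch rather than the logic in scripts/ingestion/text_chunker.py: `chunk_markdown` and its `max_chars` parameter are hypothetical, and lines starting with # inside code blocks are not treated specially.

```python
import re
from typing import List, Tuple


def chunk_markdown(text: str, max_chars: int = 1000) -> List[Tuple[str, str]]:
    """Split markdown into (heading, chunk) pairs.

    Sections are delimited by ATX headings (#, ##, ...). Within a section,
    paragraphs are packed together until adding one would exceed max_chars.
    """
    chunks: List[Tuple[str, str]] = []
    heading = ""
    lines: List[str] = []

    def flush() -> None:
        """Emit chunks for the section collected so far."""
        body = "\n".join(lines)
        paragraphs = [p.strip() for p in re.split(r"\n\s*\n", body) if p.strip()]
        current = ""
        for para in paragraphs:
            if current and len(current) + len(para) + 2 > max_chars:
                chunks.append((heading, current))
                current = para
            else:
                current = f"{current}\n\n{para}" if current else para
        if current:
            chunks.append((heading, current))

    for line in text.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            flush()  # close the previous section before starting a new one
            heading, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    flush()
    return chunks
```

Keeping the heading alongside each chunk preserves context that can later be shown as source attribution.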

Incremental Updates

To update a repository that has already been ingested:

  • Use the --clear-first flag to delete and re-ingest a repository
  • SHA-based change detection is available for future optimization

Using the Ingestion Pipeline

The complete ingestion pipeline is available through a command-line interface:

# Ingest a single repository
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo-name"

# Ingest multiple repositories
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo1 --repo owner/repo2 --repo owner/repo3"

# Clear existing data before ingestion
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo-name --clear-first"

Testing the Ingestion Pipeline

You can test the implemented components:

# Run all tests
docker compose exec backend pytest -v

# Run all ingestion tests
docker compose exec backend pytest tests/ingestion/ -v

# Run end-to-end integration tests
docker compose exec backend pytest tests/ingestion/test_e2e.py -m e2e -v

# Test individual components
docker compose exec backend pytest tests/ingestion/test_file_fetcher.py -v
docker compose exec backend pytest tests/ingestion/test_text_chunker.py -v
docker compose exec backend pytest tests/ingestion/test_embedding_processor.py -v
docker compose exec backend pytest tests/ingestion/test_db_writer.py -v
docker compose exec backend pytest tests/ingestion/test_orchestration.py -v

# Test embedder fallback logic
docker compose exec backend pytest tests/unit/test_embedder_fallback.py -v

# Manual component testing
docker compose exec backend python scripts/test_file_fetcher.py
docker compose exec backend python scripts/test_text_chunker.py

Test Results:

  • Comprehensive test coverage across all ingestion components
  • Unit, integration, and E2E tests included
  • All tests passing with high coverage

Configuration

Add to your .env file:

# GitHub API access token
GITHUB_TOKEN=ghp_your_github_token_here

CLI Usage

The orchestration pipeline provides a complete command-line interface:

# Basic usage - ingest a repository
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo anthropics/anthropic-sdk-python"

# Ingest multiple repositories at once
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo fastapi/fastapi --repo pydantic/pydantic --repo psycopg/psycopg"

# Clear existing data before ingestion
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo --clear-first"

# The script will:
# 1. Fetch all files from the repository
# 2. Chunk them into semantic pieces
# 3. Generate embeddings for each chunk
# 4. Store them in PostgreSQL with pgvector
# 5. Display progress and summary statistics

Python API Usage

You can also use the components programmatically:

from scripts.ingestion.github_client import GitHubClient
from scripts.ingestion.file_fetcher import FileFetcher
from scripts.ingestion.text_chunker import TextChunker
from scripts.ingestion.embedding_processor import EmbeddingProcessor
from scripts.ingestion.db_writer import DatabaseWriter

# Use one name for both fetching and storing so the stored data matches the source
repo_name = "owner/repo-name"

# Initialize components
client = GitHubClient()
fetcher = FileFetcher(include_markdown=True, include_code=True)
chunker = TextChunker(target_chunk_size=1500)
processor = EmbeddingProcessor()
writer = DatabaseWriter()

try:
    # Fetch the repository's files
    repo = client.get_repository(repo_name)
    files = fetcher.fetch_repository_files(repo)

    # Split every file into chunks
    all_chunks = []
    for file in files:
        all_chunks.extend(chunker.chunk_file(file))

    # Embed the chunks and store them
    embedded_chunks = processor.embed_chunks(all_chunks)
    result = writer.write_repository(repo_name, embedded_chunks)

    print(f"Successfully ingested {result['inserted_count']} chunks")
finally:
    # Release the database connection even if a step fails
    writer.close()
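
The repository-level delete and re-insert strategy with rollback, listed under Features, can be illustrated with a small self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table layout and the replace_repository_chunks function are hypothetical; the actual DatabaseWriter internals are not shown in this README.

```python
import sqlite3


def replace_repository_chunks(conn: sqlite3.Connection, repo: str, chunks: list[str]) -> int:
    """Delete a repository's chunks and insert the new set in one transaction."""
    # "with conn" commits on success and rolls back if any statement raises,
    # so a failed insert also undoes the preceding delete.
    with conn:
        conn.execute("DELETE FROM chunks WHERE repo = ?", (repo,))
        conn.executemany(
            "INSERT INTO chunks (repo, content) VALUES (?, ?)",
            [(repo, chunk) for chunk in chunks],
        )
    return len(chunks)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (repo TEXT NOT NULL, content TEXT NOT NULL)")
replace_repository_chunks(conn, "owner/repo", ["old chunk"])

try:
    # None violates NOT NULL, so the whole replacement is rolled back.
    replace_repository_chunks(conn, "owner/repo", ["new chunk", None])
except sqlite3.IntegrityError:
    pass

# The previously stored chunk is still present after the failed replacement.
rows = conn.execute("SELECT content FROM chunks").fetchall()
```

Wrapping the delete and the inserts in one transaction means readers never see a repository with its old chunks removed and its new chunks missing.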

Query Refinement

The system includes LLM-powered query refinement to improve retrieval quality:

  • Automatic Enhancement: Ambiguous queries are clarified before retrieval
  • Context Preservation: User intent maintained while adding specificity
  • Configurable: Enable/disable via REFINE_ENABLED flag
  • Multi-LLM Support: Works with OpenAI GPT-4o-mini or Gemini

Usage: Query refinement happens automatically when enabled. Refined queries can be viewed in the session inspector.

Configuration:

REFINE_ENABLED=true
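
A minimal sketch of how a flag like REFINE_ENABLED might gate refinement is shown below. The helper names env_flag and prepare_query are hypothetical, the refine callable stands in for whichever LLM call the application uses, and accepting values other than "true" (such as "1" or "yes") is an assumption rather than documented behavior.

```python
import os
from typing import Callable, Mapping


def env_flag(name: str, default: bool = False, environ: Mapping[str, str] = os.environ) -> bool:
    """Read a boolean flag from the environment (accepted spellings are an assumption)."""
    value = environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def prepare_query(
    query: str,
    refine: Callable[[str], str],
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return the refined query when refinement is enabled, else the original."""
    if not env_flag("REFINE_ENABLED", environ=environ):
        return query
    refined = refine(query).strip()
    # Fall back to the user's wording if the refiner returns nothing useful.
    return refined or query


# Stub refiner for illustration; a real one would call an LLM.
refined = prepare_query(
    "setup?",
    lambda q: "How do I set up the backend?",
    environ={"REFINE_ENABLED": "true"},
)
```

Falling back to the original query keeps retrieval working even when the refiner produces an empty response.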

Quality Assessment

Automatic quality assessment provides multi-signal scoring:

Signals

  1. Retrieval Confidence: Vector similarity scores
  2. Coherence: Answer structure and completeness
  3. Source Agreement: Consistency across retrieved chunks
  4. Historical Similarity: Comparison with past successful answers
  5. Completeness: Coverage of query aspects
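
One simple way to combine the five signals into a single score is a weighted average, sketched below. The equal weights, the 0 to 1 range for each signal, and the names QualitySignals and quality_score are all assumptions for illustration; the README does not state how the project weights or normalizes its signals.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class QualitySignals:
    """The five signals listed above, each assumed to lie in [0, 1]."""
    retrieval_confidence: float
    coherence: float
    source_agreement: float
    historical_similarity: float
    completeness: float


# Hypothetical default: every signal counts equally.
EQUAL_WEIGHTS = {name: 1.0 for name in QualitySignals.__dataclass_fields__}


def quality_score(signals: QualitySignals, weights: dict[str, float] = EQUAL_WEIGHTS) -> float:
    """Weighted average of the signals, with each value clamped to [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    values = asdict(signals)
    weighted = sum(weights[name] * min(max(value, 0.0), 1.0) for name, value in values.items())
    return weighted / total_weight


score = quality_score(QualitySignals(0.9, 0.8, 0.7, 0.6, 0.5))
```

Dividing by the total weight keeps the result in the 0 to 1 range regardless of how the weights are scaled, so thresholds such as AUTO_LABEL_CONFIDENCE_THRESHOLD could be compared against it directly.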

Active Learning

  • Smart sample selection for human review
  • Confidence-based prioritization
  • Batch review interface in admin panel
  • Automatic labeling for high-confidence cases

Configuration:

AUTO_QUALITY_ASSESSMENT_ENABLED=true
ACTIVE_LEARNING_ENABLED=true
ACTIVE_LEARNING_BUDGET=50
CANNOT_ANSWER_MIN_CONFIDENCE=0.5
AUTO_LABEL_CONFIDENCE_THRESHOLD=0.8
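
The settings above suggest a triage step along the following lines: sessions at or above the auto-label threshold are labeled automatically, and the rest compete for a limited human-review budget. This is a sketch under assumptions, not the project's selection logic; the function triage_sessions is hypothetical, and reviewing the least confident sessions first is one common prioritization (uncertainty sampling) chosen here for illustration.

```python
def triage_sessions(
    confidences: dict[str, float],
    budget: int = 50,
    auto_label_threshold: float = 0.8,
) -> tuple[list[str], list[str]]:
    """Split sessions into auto-labeled IDs and IDs queued for human review."""
    auto_labeled = sorted(
        sid for sid, conf in confidences.items() if conf >= auto_label_threshold
    )
    remaining = [
        (sid, conf) for sid, conf in confidences.items() if conf < auto_label_threshold
    ]
    # Least confident first; ties are broken by session ID for determinism.
    remaining.sort(key=lambda item: (item[1], item[0]))
    for_review = [sid for sid, _ in remaining[:budget]]
    return auto_labeled, for_review


# Hypothetical session confidences, with a review budget of two.
auto_labeled, for_review = triage_sessions(
    {"s1": 0.95, "s2": 0.30, "s3": 0.60, "s4": 0.85, "s5": 0.75},
    budget=2,
)
```

With a budget of two, the two least confident sessions are queued for review and the session at 0.75 waits for a later batch.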

Performance Optimization

Vector Reindexing

Automatic vector index reindexing for optimal query performance:

  • Monitors query performance metrics
  • Automatically triggers a reindex when degradation is detected
  • Zero-downtime reindexing
  • Configurable via REINDEX_ENABLED flag
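
A degradation check of the kind described above could compare recent query latencies against a baseline, as in the sketch below. The function should_reindex, the 1.5x ratio, and the minimum sample count are hypothetical choices; the README does not say which metrics or thresholds the project uses.

```python
from statistics import median


def should_reindex(
    baseline_ms: list[float],
    recent_ms: list[float],
    degradation_ratio: float = 1.5,
    min_samples: int = 20,
) -> bool:
    """Return True when recent median latency exceeds the baseline by the given ratio."""
    # Too few samples on either side makes the comparison unreliable.
    if len(baseline_ms) < min_samples or len(recent_ms) < min_samples:
        return False
    return median(recent_ms) > degradation_ratio * median(baseline_ms)


# Hypothetical latencies in milliseconds.
baseline = [10.0] * 20
degraded = should_reindex(baseline, [20.0] * 20)
healthy = should_reindex(baseline, [12.0] * 20)
```

Using the median rather than the mean keeps a few slow outlier queries from triggering a rebuild. For the rebuild itself, PostgreSQL 12 and later support REINDEX with the CONCURRENTLY option, which avoids blocking writes; whether the project relies on it is not stated here.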

Caching

  • React Query caching on frontend
  • Session prefetching for faster navigation
  • Optimistic updates for better UX

Monitoring

  • Request ID tracking for debugging
  • Structured JSON logging (production)
  • Performance metrics in admin dashboard
  • Benchmark tracking for continuous monitoring

Configuration:

REINDEX_ENABLED=true
BENCHMARKING_ENABLED=true
LOG_JSON=true  # For production
LOG_LEVEL=INFO
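
Structured JSON logging with a request ID can be approximated with the standard library alone, as sketched below. The names JsonFormatter, request_id_var, and configure_logging are hypothetical, as are the field names in the JSON payload; the application's actual logging setup may differ.

```python
import json
import logging
from contextvars import ContextVar

# Holds the ID of the request being handled; "-" means no request context.
request_id_var: ContextVar[str] = ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(log_json: bool, level: str = "INFO") -> logging.Logger:
    """Attach a JSON or plain-text handler depending on the LOG_JSON setting."""
    handler = logging.StreamHandler()
    if log_json:
        handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("app")
    logger.handlers = [handler]
    logger.setLevel(level)
    logger.propagate = False
    return logger


logger = configure_logging(log_json=True, level="INFO")
request_id_var.set("req-123")  # Typically set once per request by middleware.
logger.info("ingestion started for %s", "owner/repo")
```

A context variable lets every log line emitted while handling a request carry the same ID without passing it through each function call.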

Contributing

Development Workflow

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/your-feature
  3. Make your changes
  4. Write/update tests
  5. Run the tests and format the code with Black: docker compose exec backend pytest && black app/
  6. Commit: git commit -m "Add your feature"
  7. Push: git push origin feature/your-feature
  8. Create a Pull Request

Code Style

  • Follow PEP 8 guidelines
  • Use type hints for all function parameters
  • Write docstrings for public functions
  • Keep functions focused and single-purpose
  • Add tests for new features

License

MIT License - See LICENSE file for details

Support

  • User Manual: USER_MANUAL.md - Complete testing guide
  • Dynamic Training:
  • Issues: Open an issue on GitHub
  • Questions: Use GitHub Discussions

Acknowledgments

  • FastAPI for the excellent web framework
  • PostgreSQL pgvector for vector similarity search
  • OpenAI for embeddings (text-embedding-3-small) and LLM APIs
  • Google for the Gemini LLM API
  • Sentence Transformers as an alternative embedding provider
