An intelligent chatbot powered by advanced RAG (Retrieval-Augmented Generation) with agentic workflows, continuous learning, and production-ready quality assessment. Built with FastAPI, PostgreSQL (pgvector), React, and multi-LLM integration.
- RAG Architecture: Semantic search over documentation using pgvector vector database
- Multi-LLM Support: OpenAI GPT-4o-mini and Google Gemini 1.5 Flash integration with runtime switching
- Production Embeddings: OpenAI text-embedding-3-small (1536-dim) for high-quality semantic search
- Query Refinement: LLM-powered intelligent query enhancement for better retrieval
- Curated Q&A System: Priority layer for instant answers to common questions (80% similarity threshold)
- Agentic Workflows: Multi-step reasoning with learned templates (search, validate, refine, synthesize)
- Workflow Templates: Automatic learning and reuse of successful reasoning patterns
- Contextual Learning: Query-specific chunk boosting to prevent cross-category contamination
- Quality Assessment: Multi-signal quality scoring with auto-assessment
- Active Learning: Smart sample selection for efficient human review
- Tool Execution: Flexible tool calling system for extended capabilities
- Gap Detection: Automatic identification of missing knowledge and capabilities
- Vector Reindexing: Automatic index optimization for performance
- Admin Training Interface: Review sessions, provide feedback, and analyze performance metrics
- Dynamic Training Mode: Interactive chat-based refinement of answers, workflows, and knowledge chunks
- Self-Improving: Automatically adjusts retrieval accuracy weights based on user feedback
- GitHub Integration: Production-ready automated repository documentation ingestion
- Public Chat Interface: Beautiful React UI with syntax highlighting and source attribution
- Admin Dashboard: Live statistics, LLM performance analytics, and system metrics
- Session Management: Advanced filtering, search, sorting, pagination, and export (CSV/JSON)
- Dynamic Training Interface: Interactive chat-based training with gap detection and auto-implementation
- Training System: Quick review mode, bulk operations, keyboard shortcuts, undo functionality
- Workflow Visualization: Step-by-step workflow execution trace with chunk retrieval details
- Template Management: Browse, search, and manage workflow templates
- Inspector Tools: Deep session analysis and comparison capabilities
- Secure Authentication: API key authentication with protected admin routes
- Real-time Updates: React Query for efficient data fetching and optimistic updates
- Responsive Design: Mobile-friendly interface with Tailwind CSS
- Accessibility: ARIA labels, keyboard navigation, and screen reader support
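The curated Q&A priority layer listed above can be pictured as a similarity check that runs before normal retrieval. The following is a minimal sketch, assuming cosine similarity over embedding vectors and the 0.80 threshold from the feature list. `CuratedEntry` and `find_curated_answer` are hypothetical names rather than the project's actual API, and the short toy vectors stand in for real 1536-dimensional embeddings.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass
class CuratedEntry:
    """Hypothetical curated Q&A record: a stored question embedding and its answer."""
    embedding: Sequence[float]
    answer: str


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_curated_answer(
    query_embedding: Sequence[float],
    entries: Iterable[CuratedEntry],
    threshold: float = 0.80,  # assumed to correspond to the "80% similarity threshold"
) -> Optional[str]:
    """Return the best curated answer at or above the threshold, else None."""
    best_answer, best_score = None, threshold
    for entry in entries:
        score = cosine_similarity(query_embedding, entry.embedding)
        if score >= best_score:
            best_answer, best_score = entry.answer, score
    return best_answer
```

When the function returns `None`, the caller would fall through to the regular retrieval and generation path.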
Backend:
- Framework: FastAPI (Python 3.11+) with async support
- Database: PostgreSQL 16 with pgvector extension for vector similarity search
- Embeddings: OpenAI text-embedding-3-small (1536-dim) for production use
  - Note: Sentence Transformers (384-dim) is supported but requires a database migration
- LLM Providers: OpenAI GPT-4o-mini (default), Google Gemini 1.5 Flash
- Deployment: Docker Compose for local development and production
Frontend:
- Framework: React 19 with TypeScript
- Build Tool: Vite 7 for fast development and optimized builds
- Styling: Tailwind CSS 3 for responsive design
- State Management: TanStack React Query 5 (server state) + Zustand 5 (client state)
- Routing: React Router 7 with protected routes
- Validation: Zod 3.25 for runtime type safety
- HTTP Client: Axios 1.13 with interceptors
- Date Handling: date-fns 4.1 for date formatting
- Code Display: React Syntax Highlighter 16.1 for code blocks
- Testing: Vitest 4.0 + React Testing Library 16.3
- Embedding Service: Converts text to vector embeddings for semantic search
- Retrieval Service: Finds relevant knowledge chunks using vector similarity
- Generator Service: Generates responses using retrieved context and LLM
- Workflow Learner: Learns from successful query patterns to boost future retrievals
- Training System: Collects feedback and adjusts accuracy weights
- Quality Assessor: Multi-signal quality scoring and auto-assessment
- Active Learner: Smart sample selection for human review
- Query Refiner: LLM-powered query enhancement
- Template Manager: Workflow template CRUD and scoring
- Action Executor: Multi-step workflow execution engine
- Inspector Service: Session analysis and comparison tools
- Dynamic Training Orchestrator: Interactive training chat coordination
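To make the interplay between the Retrieval Service and the Training System more concrete, here is a minimal sketch of how similarity scores and learned accuracy weights might be combined. It assumes the final score is the vector similarity multiplied by a per-chunk weight; the README does not state the exact formula, and in the real system the similarity would come from pgvector rather than a Python dictionary. `rank_chunks` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def rank_chunks(
    similarities: Dict[int, float],
    accuracy_weights: Dict[int, float],
    top_k: int = 5,
) -> List[Tuple[int, float]]:
    """Rank chunk IDs by similarity scaled by a learned accuracy weight.

    Chunks without a recorded weight use the neutral weight 1.0.
    """
    scored = [
        (chunk_id, similarity * accuracy_weights.get(chunk_id, 1.0))
        for chunk_id, similarity in similarities.items()
    ]
    # Highest combined score first; ties are broken by chunk ID for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]
```

A chunk with a modest similarity but a high accuracy weight can outrank a closer match that has not earned positive feedback, which is the intended effect of feedback-driven boosting.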
- Docker and Docker Compose
- Git
- At least 8GB RAM available for Docker (OpenAI embeddings + LLM operations)
- API keys for OpenAI and/or Google Gemini
- Clone the repository
git clone <repository-url>
cd cloudvelous-chatbot
- Configure environment variables
cp .env.example .env
Edit .env and set the required values:
# Database
POSTGRES_PASSWORD=your_secure_password

# LLM Provider (choose one or both)
OPENAI_API_KEY=sk-your-openai-key
GEMINI_API_KEY=your-gemini-key
LLM_PROVIDER=openai  # or gemini

# Embedding Configuration
# Default provider is "openai" (1536-dim, text-embedding-3-small).
# Sentence-transformers (384-dim) is supported but requires a database migration.
# EMBED_PROVIDER=openai

# Feature Flags (all default to true in config.py except DEBUG_WORKFLOW; override here if needed)
# CURATED_QA_ENABLED=true
# AGENTIC_WORKFLOW_ENABLED=true
# CONTEXTUAL_BOOST_ENABLED=true
# AUTO_QUALITY_ASSESSMENT_ENABLED=true
# ACTIVE_LEARNING_ENABLED=true
# BENCHMARKING_ENABLED=true
# REINDEX_ENABLED=true
# DEBUG_WORKFLOW=false
# REFINE_ENABLED=true

# Agentic V1 Smart Review (Phase A1)
AGENTIC_V1_SMART_REVIEW=true

# GitHub Integration
GITHUB_TOKEN=ghp_your-github-token

# Security (generate with: openssl rand -hex 32)
ADMIN_JWT_SECRET=your-jwt-secret-min-32-chars
ADMIN_API_KEY=your-api-key
Step-by-Step Instructions:
- Go to GitHub Settings
  - Navigate to https://github.com/settings/tokens
  - Or: Click your profile picture → Settings → Developer settings (left sidebar) → Personal access tokens → Tokens (classic)
- Generate New Token
  - Click "Generate new token" → "Generate new token (classic)"
- Configure Token
  - Note/Name: Give it a descriptive name like Cloudvelous Chatbot - Local Dev
  - Expiration: Choose an expiration period (90 days is recommended for development)
- Select Scopes/Permissions
For this project, you'll need:
  - ✅ repo (Full control of private repositories) - if accessing private repos
  - ✅ public_repo (Access public repositories) - if only accessing public repos
  - ✅ read:org (Read org and team membership) - if working with organization repos
Minimum required: Just public_repo if you're only ingesting public documentation
- Generate and Copy
  - Click "Generate token" at the bottom
  - ⚠️ IMPORTANT: Copy the token immediately! You won't be able to see it again
  - The token will start with ghp_
- Add Token to Your Project
  - Add the token to your .env file as GITHUB_TOKEN=ghp_your_token_here
- Start all services
docker compose up -d
- Verify services are running
docker compose ps
Expected output:
NAME                            STATUS         PORTS
cloudvelous-chatbot-db          Up (healthy)   0.0.0.0:5432->5432/tcp
cloudvelous-chatbot-backend     Up             0.0.0.0:8000->8000/tcp
cloudvelous-chatbot-frontend    Up             0.0.0.0:5173->5173/tcp
cloudvelous-chatbot-postadmin   Up             0.0.0.0:5050->80/tcp
- Check API health
curl http://localhost:8000/health
Expected response:
{"status":"healthy","version":"0.4.0","phase":"3"}
- Access the API documentation
  - Swagger UI: http://localhost:8000/docs
  - ReDoc: http://localhost:8000/redoc
  - pgAdmin: http://localhost:5050 (optional)
- Access the Frontend (Included in Docker Compose)
The frontend starts automatically with docker compose up. Access:
  - Public Chat: http://localhost:5173/
  - Admin Login: http://localhost:5173/admin/login
  - Session Management: http://localhost:5173/admin/sessions
  - Training Interface: http://localhost:5173/admin/training
Admin Login:
  - Use the ADMIN_API_KEY from your .env file
Development Mode (separate from Docker):
cd frontend
npm install
npm run dev
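The health check from the setup steps above can also be scripted, for example to wait for the backend before running an ingestion job. This is a minimal sketch using only the standard library; it assumes the `/health` payload has the shape shown in the expected response (`"status": "healthy"`), and `parse_health` and `check_health` are hypothetical helper names.

```python
import json
import urllib.request


def parse_health(body: str) -> bool:
    """Return True when the response body is JSON reporting a healthy status."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(url: str = "http://localhost:8000/health", timeout: float = 5.0) -> bool:
    """Fetch the health endpoint; any connection or HTTP error counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return parse_health(response.read().decode("utf-8"))
    except OSError:  # URLError, HTTPError, and socket timeouts all derive from OSError
        return False
```

Calling `check_health()` in a short retry loop after `docker compose up -d` is one way to avoid racing the backend's startup.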
The easiest way to use Cloudvelous is through the web interface. See Frontend Web Interface in the User Manual for detailed instructions.
- For End Users (Public Chat):
  - Navigate to http://localhost:5173/
  - Type your questions in the chat input
  - View responses with source attribution
  - No authentication required
- For Admins (Sessions & Training):
  - Navigate to http://localhost:5173/admin/login
  - Enter your admin API key (from .env)
  - Access session management and training interface
  - Manage, search, and export chat sessions
curl -X POST http://localhost:8000/api/ask \
-H "Content-Type: application/json" \
-d '{
"query": "How do I implement authentication?",
"llm_provider": "openai",
"llm_model": "gpt-4o-mini"
}'
# Submit training feedback for a session
curl -X POST http://localhost:8000/api/train \
-H "Content-Type: application/json" \
-d '{
"session_id": 123,
"is_correct": true,
"chunk_feedback": [
{"chunk_id": 1, "was_useful": true}
]
}'
# List sessions (admin)
curl -X POST http://localhost:8000/api/admin/sessions \
-H "Content-Type: application/json" \
-H "X-API-Key: your-admin-api-key" \
-d '{
"skip": 0,
"limit": 10,
"feedback_status": "pending"
}'
# Get system statistics (admin)
curl -X GET http://localhost:8000/api/admin/stats \
-H "X-API-Key: your-admin-api-key"
# Start dynamic training session
curl -X POST http://localhost:8000/api/admin/training/dynamic/session \
-H "Content-Type: application/json" \
-H "X-API-Key: your-admin-api-key" \
-d '{
"mode": "edit",
"session_id": 123
}'
# Send chat message in training
curl -X POST http://localhost:8000/api/admin/training/dynamic/chat \
-H "Content-Type: application/json" \
-H "X-API-Key: your-admin-api-key" \
-d '{
"training_session_id": "uuid",
"message": "Add more information about authentication"
}'
# Get quality signals for a session
curl -X GET http://localhost:8000/api/quality/signals/{session_id} \
-H "X-API-Key: your-admin-api-key"
# List workflow templates
curl -X POST http://localhost:8000/api/admin/templates/list \
-H "Content-Type: application/json" \
-H "X-API-Key: your-admin-api-key" \
-d '{"skip": 0, "limit": 20}'
# Get template details
curl -X GET http://localhost:8000/api/admin/templates/{template_id} \
-H "X-API-Key: your-admin-api-key"
Visit http://localhost:8000/docs for interactive API documentation with:
- All available endpoints
- Request/response schemas
- Try-it-out functionality
- Authentication requirements
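For scripted access, the `/api/ask` curl call shown earlier translates directly to Python. The sketch below uses only the standard library and mirrors the request body from that example. The response schema is not documented in this README, so `ask` simply returns the decoded JSON; `build_ask_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any, Dict


def build_ask_request(
    query: str,
    provider: str = "openai",
    model: str = "gpt-4o-mini",
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build the POST request for /api/ask with the same fields as the curl example."""
    payload = {"query": query, "llm_provider": provider, "llm_model": model}
    return urllib.request.Request(
        url=f"{base_url}/api/ask",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, **kwargs: Any) -> Dict[str, Any]:
    """Send the question and return the decoded JSON response as-is."""
    request = build_ask_request(query, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Admin endpoints would additionally need the `X-API-Key` header shown in the curl examples.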
cloudvelous-chatbot/
├── backend/
│ ├── app/
│ │ ├── main.py # FastAPI application entry point
│ │ ├── config.py # Configuration management (pydantic-settings)
│ │ ├── database.py # Database connection & session
│ │ ├── exceptions.py # Custom exception classes
│ │ ├── routers/ # API route handlers
│ │ │ ├── chat.py # /api/ask endpoint
│ │ │ ├── training.py # /api/train endpoint
│ │ │ ├── admin.py # /api/admin/* endpoints
│ │ │ ├── dynamic_training.py # /api/admin/training/dynamic/*
│ │ │ ├── workflows.py # /api/workflows/* endpoints
│ │ │ ├── quality.py # /api/quality/* endpoints
│ │ │ ├── admin_templates.py # /api/admin/templates/*
│ │ │ ├── inspector.py # /api/embedding-inspector/*
│ │ │ ├── iterative_sessions.py # /api/sessions/iterative/*
│ │ │ ├── workflow_candidates.py # /api/workflows/candidates/*
│ │ │ ├── offline_training.py # /api/workflows/offline/*
│ │ │ ├── v2_metrics.py # /api/v2-metrics/* endpoints
│ │ │ ├── phase_a_admin.py # /api/admin/phase-a/*
│ │ │ └── phase_b_admin.py # /api/admin/phase-b/*
│ │ ├── services/ # Business logic layer
│ │ │ ├── embedder.py # Text embedding generation
│ │ │ ├── retriever.py # Semantic search & ranking
│ │ │ ├── generator.py # LLM response generation
│ │ │ ├── workflow_learner.py # Pattern learning
│ │ │ ├── workflow_planner.py # Multi-step planning
│ │ │ ├── workflow_regenerator.py # Dynamic workflow regeneration
│ │ │ ├── workflow_adapter.py # Workflow adaptation
│ │ │ ├── workflow_distiller.py # Workflow distillation
│ │ │ ├── action_executor.py # Workflow execution
│ │ │ ├── template_learner.py # Template management
│ │ │ ├── workflow_tracer.py # Reasoning capture
│ │ │ ├── workflow_serializer.py # Workflow serialization
│ │ │ ├── session_orchestrator.py # Session orchestration
│ │ │ ├── contextual_boost.py # Query-specific learning
│ │ │ ├── quality_metrics.py # Quality calculation
│ │ │ ├── auto_quality_assessor.py # Auto-quality scoring
│ │ │ ├── active_learner.py # Smart sample selection
│ │ │ ├── benchmark_tracker.py # Performance tracking
│ │ │ ├── feedback_analyzer.py # Feedback analysis
│ │ │ ├── missing_capability_detector.py # Gap detection
│ │ │ ├── cannot_answer_detector.py # Cannot-answer detection
│ │ │ ├── reindex_service.py # Vector reindexing
│ │ │ ├── dynamic_training_service.py # Interactive training
│ │ │ ├── training_chat_orchestrator.py # Chat coordination
│ │ │ ├── inspector_service.py # Session analysis
│ │ │ ├── template_admin_service.py # Template management
│ │ │ ├── admin_service.py # Admin operations
│ │ │ ├── v2_metrics.py # V2 metrics service
│ │ │ ├── shadow_mode.py # Shadow mode (V1/V2 comparison)
│ │ │ ├── phase_a_metrics.py # Phase A observability
│ │ │ ├── phase_b_regenerator_eval.py # Phase B evaluation
│ │ │ └── debug_workflow_logger.py # Workflow debug logging
│ │ ├── models/ # SQLAlchemy ORM models
│ │ │ ├── embeddings.py # KnowledgeChunk model
│ │ │ ├── training_sessions.py # TrainingSession model
│ │ │ ├── embedding_links.py # Chunk-session links
│ │ │ ├── feedback.py # TrainingFeedback model
│ │ │ ├── curated_qa.py # CuratedQA model
│ │ │ ├── qa_suggestions.py # QASuggestion model
│ │ │ ├── workflow_templates.py # WorkflowTemplate model
│ │ │ ├── workflow_actions.py # WorkflowAction model
│ │ │ ├── workflow_executions.py # WorkflowExecution model
│ │ │ ├── workflow_execution_steps.py # Step results
│ │ │ ├── workflow_step_chunks.py # Chunks per step
│ │ │ ├── workflow_categories.py # Taxonomy
│ │ │ ├── workflow_vectors.py # Workflow embeddings
│ │ │ ├── workflow_candidates.py # Workflow candidate tracking
│ │ │ ├── required_tools.py # Tool tracking
│ │ │ ├── improvement_tasks.py # Action items
│ │ │ ├── answer_quality_signals.py # Quality signals
│ │ │ ├── active_learning_queue.py # Learning queue
│ │ │ ├── benchmark_results.py # Performance tracking
│ │ │ ├── learning_experiments.py # A/B testing
│ │ │ ├── dynamic_training_sessions.py # Training sessions
│ │ │ ├── reindex_state.py # Reindex state tracking
│ │ │ ├── conversation_context.py # Conversation context
│ │ │ ├── shadow_comparison.py # Shadow mode comparisons
│ │ │ ├── iterative_sessions.py # Iterative sessions
│ │ │ ├── session_feedback.py # Session feedback items
│ │ │ ├── offline_experiments.py # Offline experiment data
│ │ │ ├── phase_a_metrics.py # Phase A observability models
│ │ │ └── phase_b_regenerator_metrics.py # Phase B eval models
│ │ ├── schemas/ # Pydantic request/response schemas
│ │ ├── llm/ # LLM provider implementations
│ │ │ ├── base.py # ILLMProvider interface
│ │ │ ├── factory.py # Provider factory pattern
│ │ │ ├── openai_provider.py # OpenAI integration (gpt-4o-mini)
│ │ │ └── gemini_provider.py # Google Gemini integration (1.5-flash)
│ │ ├── training/ # Training & optimization
│ │ │ └── optimizer.py # Accuracy weight optimizer
│ │ ├── middleware/ # Authentication & middleware
│ │ │ └── auth.py # JWT & API key authentication
│ │ └── utils/ # Logging, rate limiting, utilities
│ ├── tests/ # Unit and integration tests
│ │ ├── unit/ # Unit tests
│ │ ├── integration/ # Integration tests
│ │ └── ingestion/ # Ingestion pipeline tests
│ ├── alembic/ # Database migrations
│ │ └── versions/ # Migration scripts
│ ├── requirements.txt # Python dependencies
│ ├── pytest.ini # Test configuration
│ └── Dockerfile # Backend container image
├── scripts/
│ ├── init-db.sql # Database initialization SQL
│ ├── initial_ingestion.py # Data ingestion script
│ ├── manual_retrain.py # Manual retraining trigger
│ └── ingestion/ # GitHub ingestion pipeline
│ ├── github_client.py # GitHub API client
│ ├── file_fetcher.py # Repository file fetcher
│ ├── text_chunker.py # Semantic text chunking
│ ├── embedding_processor.py # Embedding generation
│ └── db_writer.py # Database storage (PostgreSQL/pgvector)
├── frontend/
│ ├── src/
│ │ ├── api/ # API service layer
│ │ │ ├── client.ts # Axios client configuration
│ │ │ ├── queryClient.ts # TanStack React Query client
│ │ │ ├── admin.service.ts # Admin API calls
│ │ │ ├── chat.service.ts # Chat API calls
│ │ │ ├── training.service.ts # Training API calls
│ │ │ ├── workflow.service.ts # Workflow API calls
│ │ │ ├── health.service.ts # Health check API calls
│ │ │ └── dynamic-training.service.ts # Dynamic training API
│ │ ├── components/
│ │ │ ├── admin/
│ │ │ │ ├── DynamicTrainingChat.tsx
│ │ │ │ ├── DynamicTrainingQuickGuide.tsx
│ │ │ │ ├── WorkflowTrace.tsx
│ │ │ │ ├── WorkflowGapIndicator.tsx
│ │ │ │ ├── SessionInspectorModal.tsx
│ │ │ │ ├── SessionTable.tsx
│ │ │ │ ├── SessionRowActions.tsx
│ │ │ │ ├── QuickReviewCard.tsx
│ │ │ │ ├── FeedbackForm.tsx
│ │ │ │ ├── ChunkList.tsx
│ │ │ │ ├── ChunkCard.tsx
│ │ │ │ ├── ChunkSelectorModal.tsx
│ │ │ │ ├── ImprovementSuggestionsPanel.tsx
│ │ │ │ ├── CustomSuggestionModal.tsx
│ │ │ │ ├── Pagination.tsx
│ │ │ │ └── KeyboardShortcutsHelp.tsx
│ │ │ ├── chat/
│ │ │ │ ├── ChatInterface.tsx
│ │ │ │ ├── MessageList.tsx
│ │ │ │ ├── Message.tsx
│ │ │ │ └── MessageInput.tsx
│ │ │ ├── auth/
│ │ │ │ └── ProtectedRoute.tsx
│ │ │ └── ui/ # Reusable UI components
│ │ │ ├── Dialog.tsx
│ │ │ ├── Modal.tsx
│ │ │ ├── LoadingSpinner.tsx
│ │ │ └── EmptyState.tsx
│ │ ├── pages/ # Page components
│ │ │ ├── AdminLoginPage.tsx
│ │ │ ├── SessionsPage.tsx
│ │ │ ├── TrainingPage.tsx
│ │ │ └── NotFoundPage.tsx
│ │ ├── hooks/ # Custom React hooks
│ │ │ ├── useChat.ts
│ │ │ ├── useSession.ts
│ │ │ ├── useInspector.ts
│ │ │ ├── useSuggestions.ts
│ │ │ └── useKeyboardShortcuts.ts
│ │ ├── store/ # Zustand stores
│ │ │ └── chatStore.ts
│ │ ├── contexts/ # React contexts
│ │ │ └── AuthContext.tsx # Auth context with useAuth hook
│ │ ├── config/ # Configuration
│ │ │ ├── api.config.ts
│ │ │ └── app.config.ts
│ │ ├── types/ # TypeScript types
│ │ ├── utils/ # Utility functions
│ │ ├── App.tsx # Root component with routing
│ │ └── main.tsx # Entry point
│ ├── public/ # Static assets
│ ├── Dockerfile # Frontend container image
│ ├── vite.config.ts # Vite configuration
│ ├── tailwind.config.js # Tailwind configuration
│ └── package.json # Dependencies
├── docker-compose.yml # Docker Compose (4 services)
├── .env.example # Environment variable template
├── USER_MANUAL.md # Complete user guide with testing
├── DYNAMIC_TRAINING_MANUAL.md # Dynamic training reference
├── QUICK_START_DYNAMIC_TRAINING.md # 5-minute guide
└── README.md # This file
# Install backend dependencies locally (optional)
cd backend
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
# Run locally without Docker
python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
# Start all services
docker compose up -d
# Start specific service
docker compose up -d backend
# Stop all services
docker compose down
# Stop without removing volumes
docker compose stop
# Restart service
docker compose restart backend
# Rebuild after code changes
docker compose up -d --build backend
# View logs
docker compose logs -f backend
# View resource usage
docker compose stats
# Clean up everything (including volumes)
docker compose down -v
# Access PostgreSQL shell
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot
# Common SQL commands
\dt # List tables
\d tablename # Describe table
SELECT COUNT(*) FROM knowledge_chunks; # Count chunks
# Run migrations
docker compose exec backend alembic upgrade head
# Create new migration
docker compose exec backend alembic revision --autogenerate -m "Add new column"
# Rollback last migration
docker compose exec backend alembic downgrade -1
# View migration history
docker compose exec backend alembic history
The project has comprehensive test coverage across unit, integration, and E2E test levels.
Test Categories:
- Unit tests: Test individual components in isolation (default, always run)
- Integration tests: Test component interactions with mocked dependencies (requires -m integration)
- E2E tests: Test complete workflows with a real PostgreSQL database (requires -m e2e and environment setup)
# Run unit tests only (default)
docker compose exec backend pytest
# Run all tests including integration tests
docker compose exec backend pytest -m "unit or integration"
# Run specific test categories
docker compose exec backend pytest -m unit # Unit tests only
docker compose exec backend pytest -m integration # Integration tests only
docker compose exec backend pytest -m e2e # E2E tests only (requires setup)
# Run tests for specific features
docker compose exec backend pytest tests/unit/test_embedder.py
docker compose exec backend pytest tests/integration/test_training_router.py
docker compose exec backend pytest tests/ingestion/ -v
# Run with coverage report
docker compose exec backend pytest --cov=app --cov-report=html
open htmlcov/index.html # View coverage report
# Run tests matching pattern
docker compose exec backend pytest -k "test_retrieval"
# Run with verbose output
docker compose exec backend pytest -v
E2E Testing (Real Database):
E2E tests verify actual database behavior with PostgreSQL + pgvector:
# Set up E2E test environment
export TEST_DATABASE_URL="postgresql://user:pass@localhost:5432/testdb"
export RUN_E2E_TESTS=true
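# Run E2E tests for training feedback
# (variables exported on the host are not visible inside the container, so pass them explicitly with -e)
docker compose exec -e TEST_DATABASE_URL="$TEST_DATABASE_URL" -e RUN_E2E_TESTS="$RUN_E2E_TESTS" backend pytest tests/integration/test_training_e2e.py -m e2e -v
# Run specific E2E test
docker compose exec -e TEST_DATABASE_URL="$TEST_DATABASE_URL" -e RUN_E2E_TESTS="$RUN_E2E_TESTS" backend pytest tests/integration/test_training_e2e.py::TestTrainingFeedbackE2E::test_submit_feedback_updates_was_useful_flags_in_real_db -xvs
Test Coverage: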
- Unit tests: 150+ tests
- Integration tests: 50+ tests
- E2E tests: 20+ tests
- Frontend tests: Component and hook tests
- Total: 220+ tests with comprehensive coverage
# Format code with Black
docker compose exec backend black app/
# Lint with Flake8
docker compose exec backend flake8 app/
# Type checking with MyPy
docker compose exec backend mypy app/
# Run all quality checks
docker compose exec backend sh -c "black app/ && flake8 app/ && mypy app/"
All configuration is managed through environment variables defined in .env.
| Variable | Description | Default | Required |
|---|---|---|---|
| `POSTGRES_PASSWORD` | PostgreSQL password | - | Yes |
| `DATABASE_URL` | Full database connection string | Auto-generated | No |
| `OPENAI_API_KEY` | OpenAI API key | - | If using OpenAI |
| `GEMINI_API_KEY` | Gemini API key | - | If using Gemini |
| `GITHUB_TOKEN` | GitHub API access token | - | For ingestion |
| `LLM_PROVIDER` | Default LLM provider | `openai` | No |

| Variable | Description | Default | Required |
|---|---|---|---|
| `EMBED_PROVIDER` | Embedding provider | `openai` | Yes |
| `OPENAI_EMBEDDING_MODEL` | OpenAI model name | `text-embedding-3-small` | If using OpenAI |
| `OPENAI_EMBEDDING_DIMENSIONS` | Embedding dimensions | `1536` | If using OpenAI |
| `SENTENCE_TRANSFORMERS_MODEL` | Sentence transformers model | `all-MiniLM-L6-v2` | If using sentence-transformers |
| `SENTENCE_TRANSFORMERS_DIMENSIONS` | Embedding dimensions | `384` | If using sentence-transformers |

| Variable | Description | Default |
|---|---|---|
| `CURATED_QA_ENABLED` | Enable curated Q&A priority layer | `true` |
| `AGENTIC_WORKFLOW_ENABLED` | Enable multi-step workflows | `true` |
| `CONTEXTUAL_BOOST_ENABLED` | Enable query-specific learning | `true` |
| `AUTO_QUALITY_ASSESSMENT_ENABLED` | Enable auto-quality scoring | `true` |
| `ACTIVE_LEARNING_ENABLED` | Enable smart sample selection | `true` |
| `REFINE_ENABLED` | Enable LLM-based query refinement | `true` |
| `BENCHMARKING_ENABLED` | Enable performance tracking | `true` |
| `REINDEX_ENABLED` | Enable automatic reindexing | `true` |
| `DEBUG_WORKFLOW` | Enable detailed workflow tracing | `false` |

| Variable | Description | Default |
|---|---|---|
| `TOP_K_RETRIEVAL` | Number of chunks to retrieve | `5` |
| `WORKFLOW_BOOST_FACTOR` | Boost for workflow matches | `1.2` |
| `CHUNK_WEIGHT_ADJUSTMENT_RATE` | Learning rate for weights | `0.1` |
| `MIN_CHUNK_WEIGHT` | Minimum accuracy weight | `0.5` |
| `MAX_CHUNK_WEIGHT` | Maximum accuracy weight | `2.0` |

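The weight settings above suggest a bounded update applied each time a reviewer marks a chunk as useful or not. The sketch below shows one way this could work, assuming a simple additive step of `CHUNK_WEIGHT_ADJUSTMENT_RATE` clamped to the `MIN_CHUNK_WEIGHT` and `MAX_CHUNK_WEIGHT` range. The actual optimizer in `training/optimizer.py` may use a different rule; `adjust_weight` is a hypothetical name.

```python
def adjust_weight(
    current: float,
    was_useful: bool,
    rate: float = 0.1,        # CHUNK_WEIGHT_ADJUSTMENT_RATE default
    min_weight: float = 0.5,  # MIN_CHUNK_WEIGHT default
    max_weight: float = 2.0,  # MAX_CHUNK_WEIGHT default
) -> float:
    """Nudge a chunk's accuracy weight up or down and clamp it to the allowed range."""
    delta = rate if was_useful else -rate
    return max(min_weight, min(max_weight, current + delta))
```

The clamp keeps a single chunk from dominating retrieval after a run of positive feedback, and keeps a repeatedly rejected chunk retrievable for queries where it is the only close match.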
| Variable | Description | Default |
|---|---|---|
| `WORKFLOW_EMBEDDING_ENABLED` | Enable workflow learning | `true` |
| `WORKFLOW_SIMILARITY_WEIGHT` | Weight for workflow similarity | `0.3` |
| `FEEDBACK_THRESHOLD_FOR_RETRAIN` | Feedback count trigger | `50` |
| `MIN_WORKFLOW_CLUSTER_SIZE` | Minimum cluster size | `3` |

| Variable | Description | Default |
|---|---|---|
| `CANNOT_ANSWER_MIN_CONFIDENCE` | Minimum confidence for quality | `0.5` |
| `AUTO_LABEL_CONFIDENCE_THRESHOLD` | Threshold for auto-labeling | `0.8` |
| `ACTIVE_LEARNING_BUDGET` | Reviews per day | `50` |

| Variable | Description | Required |
|---|---|---|
| `ADMIN_JWT_SECRET` | JWT signing secret (min 32 chars) | Yes |
| `ADMIN_API_KEY` | Admin API authentication key | Yes |
| `JWT_ALGORITHM` | JWT signing algorithm | No (HS256) |
| `JWT_EXPIRATION_HOURS` | Token expiry time (hours) | No (24) |

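For the security settings above, the two values that must be set by hand can be generated from Python instead of `openssl rand -hex 32`. The sketch below also shows a constant-time key comparison of the kind an API-key check typically uses. Both helpers are hypothetical illustrations and not the code in `middleware/auth.py`.

```python
import hmac
import secrets


def generate_secret(num_bytes: int = 32) -> str:
    """Return a random hex string; 32 bytes yields 64 hex characters."""
    return secrets.token_hex(num_bytes)


def api_key_matches(provided: str, expected: str) -> bool:
    """Compare two keys in constant time to avoid leaking match length via timing."""
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))
```

Run `generate_secret()` once per value and paste the results into `.env` as `ADMIN_JWT_SECRET` and `ADMIN_API_KEY`.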
# Follow all logs
docker compose logs -f
# Follow specific service
docker compose logs -f backend
# Last 100 lines
docker compose logs --tail=100 backend
# Export logs to file
docker compose logs > logs.txt
# Search logs for errors
docker compose logs backend | grep -i error
# Backend health
curl http://localhost:8000/health
# Database connection
docker compose exec backend python -c "
from sqlalchemy import text
from app.database import SessionLocal
db = SessionLocal()
print('Connected:', db.execute(text('SELECT 1')).scalar())
db.close()
"
# Check embedder
docker compose exec backend python -c "
from app.services.embedder import EmbeddingService
embedder = EmbeddingService()
print('Provider:', embedder.provider, '| Dimensions:', embedder.dimension)
"
# Container resource usage
docker compose stats
# Database stats
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot -c "
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
"
# API endpoint stats (requires admin API key)
curl -H "X-API-Key: your-admin-api-key" \
http://localhost:8000/api/admin/stats
Backend won't start - "ModuleNotFoundError"
# Rebuild container
docker compose up -d --build backend
Database connection refused
# Check PostgreSQL is running
docker compose ps postgres
# Restart PostgreSQL
docker compose restart postgres
# Wait a few seconds and restart backend
sleep 5
docker compose restart backend
Port already in use
# Find what's using the port
lsof -i :8000
# Kill the process or change port in docker-compose.yml
Slow query responses
# Check database indices
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot -c "\di"
# Check embedding count
docker compose exec postgres psql -U chatbot_user -d cloudvelous_chatbot -c \
"SELECT COUNT(*) FROM knowledge_chunks;"
For more detailed troubleshooting, check the Docker logs and health endpoints described above.
The system includes a production-ready GitHub repository ingestion pipeline for automated documentation processing, backed by comprehensive tests:
- ✅ Complete end-to-end ingestion workflow
- ✅ OpenAI embeddings (1536-dim) in production use
- ✅ Transaction management with rollback support
- ✅ Comprehensive test coverage (220+ tests)
- ✅ CLI interface for repository ingestion
- ✅ Batch processing and error handling
- Repository file fetching with type filtering (markdown, code, docs)
- Recursive directory traversal with skip patterns
- Smart text chunking by markdown headers and paragraphs
- Vector embedding generation with OpenAI text-embedding-3-small (1536-dim)
- Database storage with PostgreSQL and pgvector extension
- Transaction management with rollback on failure
- Repository-level delete and re-insert strategy
- Full orchestration pipeline with CLI (scripts/initial_ingestion.py)
- Progress tracking and summary reporting
- Error handling and partial failure recovery
- End-to-end integration testing with real GitHub repositories
- Support for batch processing and lazy model loading
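The "chunking by markdown headers and paragraphs" step in the list above can be illustrated with a short sketch. It assumes sections start at header lines and that paragraphs within a section are packed together until a target size is reached. The 1500-character default mirrors the `target_chunk_size=1500` value used elsewhere in this README, but `chunk_markdown` is a hypothetical function and not the project's `TextChunker`.

```python
import re
from typing import List


def chunk_markdown(text: str, target_chunk_size: int = 1500) -> List[str]:
    """Split markdown into chunks that never cross a header boundary."""
    # Zero-width split before each header line so the header stays with its section.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks: List[str] = []
    for section in sections:
        current = ""
        for paragraph in re.split(r"\n\s*\n", section.strip()):
            if not paragraph:
                continue
            candidate = f"{current}\n\n{paragraph}" if current else paragraph
            if current and len(candidate) > target_chunk_size:
                # Adding this paragraph would exceed the target: close the chunk.
                chunks.append(current)
                current = paragraph
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks
```

A single paragraph longer than the target is kept whole in this sketch; a production chunker would likely split it further, and would also need to ignore `#` lines inside fenced code blocks.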
For repository updates:
- Use the --clear-first flag to delete and re-ingest a repository
- SHA-based change detection is available for future optimization
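As an illustration of what SHA-based change detection could look like, the sketch below hashes each file's content and reports the paths whose hash differs from a previously stored value. This is an assumption about the approach, not a description of the existing pipeline: `content_sha` and `changed_paths` are hypothetical helpers, and SHA-256 over the decoded text is used here for simplicity.

```python
import hashlib
from typing import Dict, List


def content_sha(content: str) -> str:
    """Hex SHA-256 digest of a file's text content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def changed_paths(files: Dict[str, str], known_shas: Dict[str, str]) -> List[str]:
    """Return the sorted paths that are new or whose content hash has changed."""
    return sorted(
        path
        for path, content in files.items()
        if known_shas.get(path) != content_sha(content)
    )
```

Only the returned paths would need to be re-chunked and re-embedded, which avoids the full delete and re-insert that `--clear-first` performs.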
The complete ingestion pipeline is now available via command-line interface:
# Ingest a single repository
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo-name"
# Ingest multiple repositories
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo1 --repo owner/repo2 --repo owner/repo3"
# Clear existing data before ingestion
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo-name --clear-first"
You can test the implemented components:
# Run all tests
docker compose exec backend pytest -v
# Run all ingestion tests
docker compose exec backend pytest tests/ingestion/ -v
# Run end-to-end integration tests
docker compose exec backend pytest tests/ingestion/test_e2e.py -m e2e -v
# Test individual components
docker compose exec backend pytest tests/ingestion/test_file_fetcher.py -v
docker compose exec backend pytest tests/ingestion/test_text_chunker.py -v
docker compose exec backend pytest tests/ingestion/test_embedding_processor.py -v
docker compose exec backend pytest tests/ingestion/test_db_writer.py -v
docker compose exec backend pytest tests/ingestion/test_orchestration.py -v
# Test embedder fallback logic
docker compose exec backend pytest tests/unit/test_embedder_fallback.py -v
# Manual component testing
docker compose exec backend python scripts/test_file_fetcher.py
docker compose exec backend python scripts/test_text_chunker.py
Test Results:
- Comprehensive test coverage across all ingestion components
- Unit, integration, and E2E tests included
- All tests passing with high coverage
Add to your .env file:
# GitHub API access token
GITHUB_TOKEN=ghp_your_github_token_here
The orchestration pipeline provides a complete command-line interface:
# Basic usage - ingest a repository
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo anthropics/anthropic-sdk-python"
# Ingest multiple repositories at once
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo fastapi/fastapi --repo pydantic/pydantic --repo psycopg/psycopg"
# Clear all existing data before ingestion
docker compose exec backend bash -c "PYTHONPATH=/app python scripts/initial_ingestion.py --repo owner/repo --clear-first"
# The script will:
# 1. Fetch all files from the repository
# 2. Chunk them into semantic pieces
# 3. Generate embeddings for each chunk
# 4. Store them in PostgreSQL with pgvector
# 5. Display progress and summary statistics
You can also use the components programmatically:
from scripts.ingestion.github_client import GitHubClient
from scripts.ingestion.file_fetcher import FileFetcher
from scripts.ingestion.text_chunker import TextChunker
from scripts.ingestion.embedding_processor import EmbeddingProcessor
from scripts.ingestion.db_writer import DatabaseWriter
# Initialize components
client = GitHubClient()
fetcher = FileFetcher(include_markdown=True, include_code=True)
chunker = TextChunker(target_chunk_size=1500)
processor = EmbeddingProcessor()
writer = DatabaseWriter()
# Process repository (use one name for both fetching and writing)
repo_name = "owner/repo-name"
repo = client.get_repository(repo_name)
files = fetcher.fetch_repository_files(repo)
all_chunks = []
for file in files:
    chunks = chunker.chunk_file(file)
    all_chunks.extend(chunks)
embedded_chunks = processor.embed_chunks(all_chunks)
result = writer.write_repository(repo_name, embedded_chunks)
print(f"Successfully ingested {result['inserted_count']} chunks")
writer.close()
The system includes LLM-powered query refinement to improve retrieval quality:
- Automatic Enhancement: Ambiguous queries are clarified before retrieval
- Context Preservation: User intent maintained while adding specificity
- Configurable: Enable/disable via REFINE_ENABLED flag
- Multi-LLM Support: Works with OpenAI GPT-4o-mini or Gemini
Usage: Query refinement happens automatically when enabled. View refined queries in the session inspector.
Configuration:
REFINE_ENABLED=true
Automatic quality assessment provides multi-signal scoring:
- Retrieval Confidence: Vector similarity scores
- Coherence: Answer structure and completeness
- Source Agreement: Consistency across retrieved chunks
- Historical Similarity: Comparison with past successful answers
- Completeness: Coverage of query aspects
- Smart sample selection for human review
- Confidence-based prioritization
- Batch review interface in admin panel
- Automatic labeling for high-confidence cases
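The signals and the auto-labeling behaviour listed above could be combined roughly as follows. This is a minimal sketch that assumes each signal is a value in [0, 1], that the overall score is their unweighted mean, and that anything not clearly above or below the confidence threshold is left for human review. The real assessor may weight signals differently, and the label strings are hypothetical.

```python
from typing import Dict, Optional

# The five signals named in the list above.
SIGNALS = (
    "retrieval_confidence",
    "coherence",
    "source_agreement",
    "historical_similarity",
    "completeness",
)


def quality_score(signals: Dict[str, float]) -> float:
    """Unweighted mean of the five quality signals (equal weighting is an assumption)."""
    return sum(signals[name] for name in SIGNALS) / len(SIGNALS)


def auto_label(score: float, threshold: float = 0.8) -> Optional[str]:
    """Label confidently good or bad answers; return None to queue for human review."""
    if score >= threshold:
        return "correct"
    if score <= 1.0 - threshold:
        return "incorrect"
    return None
```

Answers that come back as `None` are the natural candidates for the active learning queue, since they are the ones where a human label adds the most information.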
Configuration:
AUTO_QUALITY_ASSESSMENT_ENABLED=true
ACTIVE_LEARNING_ENABLED=true
ACTIVE_LEARNING_BUDGET=50
CANNOT_ANSWER_MIN_CONFIDENCE=0.5
AUTO_LABEL_CONFIDENCE_THRESHOLD=0.8
Automatic vector index reindexing for optimal query performance:
- Monitors query performance metrics
- Automatically triggers reindex when degradation detected
- Zero-downtime reindexing
- Configurable via REINDEX_ENABLED flag
- React Query caching on frontend
- Session prefetching for faster navigation
- Optimistic updates for better UX
- Request ID tracking for debugging
- Structured JSON logging (production)
- Performance metrics in admin dashboard
- Benchmark tracking for continuous monitoring
Configuration:
REINDEX_ENABLED=true
BENCHMARKING_ENABLED=true
LOG_JSON=true # For production
LOG_LEVEL=INFO
- Fork the repository
- Create a feature branch: git checkout -b feature/your-feature
- Make your changes
- Write/update tests
- Run tests and linting inside the backend container: docker compose exec backend sh -c "pytest && black app/"
- Commit: git commit -m "Add your feature"
- Push: git push origin feature/your-feature
- Create a Pull Request
- Follow PEP 8 guidelines
- Use type hints for all function parameters
- Write docstrings for public functions
- Keep functions focused and single-purpose
- Add tests for new features
MIT License - See LICENSE file for details
- User Manual: USER_MANUAL.md - Complete testing guide
- Dynamic Training:
  - Quick Start: QUICK_START_DYNAMIC_TRAINING.md - 5-minute guide
  - Full Manual: DYNAMIC_TRAINING_MANUAL.md - Complete reference
- Issues: Open an issue on GitHub
- Questions: Use GitHub Discussions
- FastAPI for the excellent web framework
- PostgreSQL pgvector for vector similarity search
- OpenAI for embeddings (text-embedding-3-small) and LLM APIs
- Google for the Gemini LLM API
- Sentence Transformers as an alternative embedding provider