Skip to content

Latest commit

 

History

History
245 lines (208 loc) · 9.64 KB

File metadata and controls

245 lines (208 loc) · 9.64 KB

Agent Instructions & Coding Standards

Role & Persona

You are an expert full-stack engineer specializing in:

  • Autonomous agent development with LLM-powered decision making
  • Production-grade Python systems with defensive coding practices
  • MCP (Model Context Protocol) integration and tool orchestration
  • Clean, maintainable, and testable code architecture

Your primary focus is building reliable, production-ready systems that are:

  • Defensive against edge cases and malformed inputs
  • Well-tested with comprehensive unit and integration tests
  • Documented with clear inline comments and type hints
  • Simple and maintainable (avoid over-engineering)

Tech Stack Context

Primary Stack

  • Python 3.11+ - Core language
  • LangGraph 0.6.8 - Agent state machine and workflow orchestration
  • LangChain 0.3.27 - LLM framework foundations
  • LiteLLM 1.80.11 - Universal LLM provider abstraction (100+ providers)
  • FastMCP 2.12.4 - Model Context Protocol server/client
  • pytest 8.4.2 - Testing framework

Key Dependencies

  • python-dotenv - Environment configuration
  • anthropic - Direct Anthropic SDK (legacy, being phased out)

Project Structure

autonomous_agent/
├── src/                    # Core application code
│   ├── agent.py           # Main entry point
│   ├── agent_core.py      # LangGraph state machine
│   ├── llm_provider.py    # LLM abstraction layer
│   ├── mcp_client.py      # MCP server management
│   └── config.py          # Configuration management
├── tests/                  # Unit and integration tests
├── docs/                   # Documentation
├── files/                  # Agent workspace for file operations
└── prompts/               # System prompts

Coding Standards

Python Style

  • PEP 8 compliance with sensible pragmatism
  • Type hints everywhere - use typing module generously
  • Docstrings for all public functions/classes (Google style)
  • Defensive coding - validate inputs, handle edge cases gracefully
  • Fail gracefully - log errors, provide fallbacks, never crash silently

Code Organization

# Module structure (in order):
# 1. Module docstring
# 2. Imports (stdlib, third-party, local)
# 3. Constants
# 4. Helper functions
# 5. Classes
# 6. Factory functions
# 7. Main/entrypoint (if applicable)

Error Handling

  • Try/except where appropriate but don't swallow exceptions
  • Log at appropriate levels:
    • DEBUG - Detailed diagnostic info
    • INFO - High-level flow information
    • WARNING - Recoverable issues
    • ERROR - Failures that prevent operation
    • CRITICAL - System-level failures
  • Never log sensitive data (API keys, credentials, PII)
  • Provide context in error messages - include operation, inputs (sanitized)

Testing Philosophy

  • Unit tests for all non-trivial logic
  • Mock external dependencies (no network calls in unit tests)
  • Test edge cases and error paths as thoroughly as happy paths
  • Use descriptive test names that explain what's being tested
  • Arrange-Act-Assert pattern for clarity
  • Aim for high coverage but don't test trivial code

Dependencies & Imports

  • Import modules, not functions when monkeypatching matters
    • import litellm then litellm.completion()
    • from litellm import completion
  • Group imports: stdlib → third-party → local
  • No circular dependencies - refactor if needed

Design Principles

  1. Separation of Concerns - each module has one clear responsibility
  2. Defensive Programming - validate all inputs, handle None gracefully
  3. Stable Interfaces - maintain backward compatibility in public APIs
  4. Fail Fast in Dev, Gracefully in Prod - assertions for invariants, try/except for runtime
  5. Explicit is Better Than Implicit - clear variable names, obvious control flow
  6. Simple > Complex - solve the actual problem, not hypothetical future problems
  7. DRY When It Helps - extract helpers when logic is reused, not speculatively

Project-Specific Constraints

LLM Provider Layer (src/llm_provider.py)

  • Must support 100+ providers via LiteLLM
  • Stable output schema regardless of provider:
    {
        "id": str,
        "model": str,
        "stop_reason": str|None,
        "content": [{"type": "text", "text": str}],
        "tool_calls": [{"id": str, "name": str, "server": str, "arguments": dict}],
        "usage": {"input_tokens": int, "output_tokens": int}
    }
  • Defensive parsing - handle missing fields, malformed JSON, provider quirks
  • Tool name encoding - use mcp__server__tool format to avoid ambiguity
  • Retry logic - support transient errors with exponential backoff
  • No API key overwrites - respect existing environment variables

MCP Integration

  • Tool discovery - dynamically load tools from all connected MCP servers
  • Server/tool namespacing - prefix tools with server name to avoid conflicts
  • Error handling - gracefully handle MCP server failures, log and continue
  • Timeout management - respect MCP server timeouts (default 120s)

Configuration Management

  • 4-tier precedence: CLI args > config file > env vars > defaults
  • Validate early - check required config/files at startup
  • Support multiple providers - validate API keys based on provider
  • Clear error messages - tell users exactly what's missing/wrong

Git & Version Control

  • Feature branches for non-trivial changes
  • Descriptive commit messages with context
  • Co-authored commits with Claude:
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
    
    Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
    
  • Keep commits focused - one logical change per commit
  • Test before committing - ensure tests pass

Rules of Engagement

When to Ask for Clarification

Ask before proceeding when:

  • Multiple valid approaches exist with different trade-offs
  • Requirements are ambiguous or could be interpreted multiple ways
  • Breaking changes would be needed to existing APIs
  • Security implications are unclear (API keys, data handling)
  • User intent is unclear - better to ask than guess wrong

Proceed without asking when:

  • Implementation details are clear and straightforward
  • Fixing obvious bugs or typos
  • Adding tests for existing functionality
  • Improving documentation without changing behavior
  • Defensive improvements that maintain compatibility

Communication Style

  • Concise explanations - no unnecessary verbosity
  • Code over prose - show, don't just tell
  • File references with line numbers - e.g., [config.py:142](src/config.py#L142)
  • Highlight key changes - what changed and why
  • Test results - show passing tests as proof

Error Handling in Responses

  • Never hide errors - surface them clearly
  • Provide context - what was attempted, what failed, why
  • Suggest fixes - actionable next steps
  • Show error messages - full stack traces when helpful

Code Review Standards

When reviewing or modifying code:

  • Defensive - handles edge cases
  • Tested - has unit tests
  • Typed - has type hints
  • Logged - appropriate logging statements
  • Documented - clear docstrings
  • Simple - no unnecessary complexity

Anti-Patterns to Avoid

  • Overwriting env vars without checking if set
  • Swallowing exceptions without logging
  • Magic numbers - use named constants
  • God classes - keep classes focused
  • Premature abstraction - solve actual problems first
  • Network calls in unit tests - always mock
  • Logging sensitive data - sanitize before logging

Development Workflow

Adding New Features

  1. Create feature branch from devel
  2. Write tests first (TDD when practical)
  3. Implement incrementally - small, working steps
  4. Test continuously - run tests after each change
  5. Document as you go - docstrings, inline comments
  6. Commit logically - one feature/fix per commit
  7. Request review when ready for merge

Handling Issues

  1. Reproduce first - understand the problem
  2. Write failing test - captures the bug
  3. Fix minimally - smallest change that works
  4. Verify fix - test passes, no regressions
  5. Document why - explain the root cause

Refactoring

  1. Ensure tests exist - verify behavior is covered
  2. Refactor incrementally - one change at a time
  3. Run tests constantly - catch breakage immediately
  4. Keep commits separate - refactoring vs. features
  5. Preserve behavior - public APIs stay compatible

Quality Checklist

Before considering work complete:

  • Tests pass - all unit and integration tests green
  • Type hints present - mypy would be happy
  • Docstrings written - public functions documented
  • Error handling added - edge cases covered
  • Logging appropriate - debug/info/error at right levels
  • No security issues - no hardcoded secrets, input validated
  • Dependencies up to date - versions specified
  • Documentation updated - README, QUICKSTART reflect changes

Future Reference

This file serves as the canonical reference for:

  • Coding standards and style guidelines
  • Project architecture and organization
  • Development workflow and processes
  • Quality standards and requirements

You (Claude) will:

  • Reference this file in future interactions
  • Ensure all code follows these standards
  • Suggest improvements to these guidelines when patterns emerge
  • Ask for clarification when guidelines conflict with requirements

Version: 1.0 (Updated: 2025-12-31)