feat: add configurable token cache for customer flow #133

Open
vitalykumov wants to merge 2 commits into SAP:main from vitalykumov:feat/token-caching

Description

Add an in-process token cache for the customer agent flow. Previously, every list_mcp_tools() / call_mcp_tool() call fetched a fresh IAS token via mTLS, which added unnecessary latency in agentic loops.

Changes:

  • _token_cache.py (new): _TokenCache provides TTL expiry and LRU eviction for system tokens (key: app_tid) and user tokens (key: sha256(user_jwt+"|"+app_tid)[:16]). Expiry is derived from expires_in, the id_token exp claim, or a fallback TTL.
  • _customer.py: get_system_token_mtls and exchange_user_token now consult and populate the cache. A 401 response from the MCP server invalidates the stale token and triggers one retry.
  • agw_client.py: AgentGatewayClient owns the _TokenCache and exposes clear_token_cache() for forced refresh (revoked credentials, tenant change).
  • config.py: 4 new ClientConfig fields: token_expiry_buffer_seconds (60 s), max_system_token_cache_size (10 entries), max_user_token_cache_size (10 entries), fallback_token_ttl_seconds (300 s).
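
For reviewers who want a mental model of the cache before opening _token_cache.py, the behaviour described in the list above can be sketched roughly as follows. This is not the PR's code: TtlLruCache, user_cache_key, and the injectable clock are hypothetical names, and the sketch assumes that [:16] is applied to the hex digest.

```python
import hashlib
import time
from collections import OrderedDict
from typing import Callable, Optional


def user_cache_key(user_jwt: str, app_tid: str) -> str:
    """Derive the user-token key: sha256(user_jwt + "|" + app_tid)[:16].

    Assumption: the slice is taken from the hex digest, not the raw bytes.
    """
    digest = hashlib.sha256(f"{user_jwt}|{app_tid}".encode("utf-8")).hexdigest()
    return digest[:16]


class TtlLruCache:
    """Hypothetical stand-in for _TokenCache: bounded size plus per-entry expiry."""

    def __init__(self, max_size: int = 10,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_size = max_size
        self._clock = clock  # injectable so tests can control time
        self._entries: "OrderedDict[str, tuple[str, float]]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        token, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._entries[key]
            return None
        # Mark as most recently used.
        self._entries.move_to_end(key)
        return token

    def put(self, key: str, token: str, ttl_seconds: float) -> None:
        self._entries[key] = (token, self._clock() + ttl_seconds)
        self._entries.move_to_end(key)
        # Evict least recently used entries once the bound is exceeded.
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)

    def invalidate(self, key: str) -> None:
        self._entries.pop(key, None)

    def clear(self) -> None:
        self._entries.clear()
```

Hashing the user JWT keeps the raw token out of the key space, and truncating to 16 characters keeps keys short while collisions remain very unlikely at a cache size of 10.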

The LoB flow is unaffected because it delegates to BTP Destination Service.

Type of Change

Please check the relevant option:

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Code refactoring
  • Dependency update

How to Test

Describe how reviewers can test your changes:

  1. Programmatically:
    async def verify_token_caching(agw_client) -> bool:
        """Verify system token is cached across list_mcp_tools calls.
    
        Patches _request_token_mtls to count real HTTP token requests.
        First call must fetch at least one token; second call must reuse cache.
        Returns True if caching works correctly.
        """
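        # _customer_mod is assumed to be the package's _customer module,
        # imported under that alias earlier in the script (import elided here).
        original = _customer_mod._request_token_mtls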
        call_count = 0
    
        def counting(*args, **kwargs):
            nonlocal call_count
            call_count += 1
            return original(*args, **kwargs)
    
        _customer_mod._request_token_mtls = counting
        try:
            await agw_client.list_mcp_tools()
            after_first = call_count
    
            await agw_client.list_mcp_tools()
            after_second = call_count
    
            agw_client.clear_token_cache()
            await agw_client.list_mcp_tools()
            after_clear = call_count
        finally:
            _customer_mod._request_token_mtls = original
    
        cache_hit = after_second == after_first
        invalidate_works = after_clear > after_second
    
        print("\nToken Caching")
        print("-" * 40)
        print(f"  1st call  token requests : {after_first}")
        print(f"  2nd call  token requests : {after_second - after_first}  {'✓ cache hit' if cache_hit else '✗ expected 0'}")
        print(f"  post-clear token requests: {after_clear - after_second}  {'✓ re-fetched' if invalidate_works else '✗ expected ≥1'}")
    
        return cache_hit and invalidate_works
    ...
    async def main():
        agw_client = create_client()
    
        ok = await verify_token_caching(agw_client)
        if not ok:
            print("\nWARNING: token caching check failed")
    ...
  2. Unit tests:
    pytest tests/agentgateway/unit/test_token_cache.py tests/agentgateway/unit/test_customer.py tests/agentgateway/unit/test_agw_client.py

Checklist

Before submitting your PR, please review and check the following:

  • I have read the Contributing Guidelines
  • I have verified that my changes solve the issue
  • I have added/updated automated tests to cover my changes
  • All tests pass locally
  • I have verified that my code follows the Code Guidelines
  • I have updated documentation (if applicable)
  • I have added type hints for all public APIs
  • My code does not contain sensitive information (credentials, tokens, etc.)
  • I have followed Conventional Commits for commit messages

Breaking Changes

None. The cache is internal to AgentGatewayClient, so existing create_client() calls get caching automatically. All new ClientConfig fields have defaults.
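
To make the defaults concrete, here is a minimal sketch of how the four fields might combine into a cache TTL. TokenCacheSettings and cache_ttl_seconds are hypothetical names, not the PR's API. The sketch assumes a precedence of expires_in, then the id_token exp claim, then the fallback TTL, and assumes the expiry buffer is subtracted in all three cases; the actual implementation may differ on both points.

```python
import base64
import json
import time
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class TokenCacheSettings:
    """Hypothetical mirror of the four new ClientConfig fields and their defaults."""
    token_expiry_buffer_seconds: int = 60
    max_system_token_cache_size: int = 10
    max_user_token_cache_size: int = 10
    fallback_token_ttl_seconds: int = 300


def _jwt_exp(id_token: str) -> Optional[float]:
    """Read the exp claim from a JWT payload (no signature verification)."""
    try:
        payload_b64 = id_token.split(".")[1]
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)
        claims = json.loads(base64.urlsafe_b64decode(padded))
        return float(claims["exp"])
    except (IndexError, KeyError, TypeError, ValueError):
        return None


def cache_ttl_seconds(response: Mapping[str, Any],
                      settings: TokenCacheSettings,
                      now: Optional[float] = None) -> float:
    """Return how long a token may stay cached, in seconds."""
    now = time.time() if now is None else now
    expires_in = response.get("expires_in")
    if expires_in is not None:
        lifetime = float(expires_in)
    else:
        exp = _jwt_exp(str(response.get("id_token") or ""))
        if exp is not None:
            lifetime = exp - now
        else:
            lifetime = float(settings.fallback_token_ttl_seconds)
    # Assumption: the buffer applies uniformly; never return a negative TTL.
    return max(0.0, lifetime - settings.token_expiry_buffer_seconds)
```

Under these assumptions, a token with expires_in of 3600 would be cached for 3540 seconds, and a response with no expiry information would be cached for 240 seconds.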

Additional Notes

Thread safety: The GIL makes individual OrderedDict operations atomic, but the check-then-set sequence is not. Concurrent coroutines on the same key may both miss and both fetch, which causes redundant requests but no corruption. This is acceptable for the agentic-loop use case.
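
The double-fetch scenario can be reproduced with a few lines of asyncio. This is an illustrative sketch with hypothetical names (demo_double_fetch, fetch_token, get_token) and a plain dict in place of the real cache; in single-threaded asyncio code the window opens at the await between the cache check and the cache write.

```python
import asyncio
from typing import Dict


async def demo_double_fetch() -> int:
    """Run two concurrent lookups for the same key and count token fetches."""
    cache: Dict[str, str] = {}
    fetches = 0

    async def fetch_token(key: str) -> str:
        nonlocal fetches
        fetches += 1
        await asyncio.sleep(0)  # stands in for the mTLS round trip
        return f"token-for-{key}"

    async def get_token(key: str) -> str:
        cached = cache.get(key)          # check
        if cached is not None:
            return cached
        token = await fetch_token(key)   # other coroutines may run here
        cache[key] = token               # set
        return token

    first, second = await asyncio.gather(get_token("tenant-a"), get_token("tenant-a"))
    assert first == second               # redundant work, but a consistent result
    await get_token("tenant-a")          # a later call is served from the cache
    return fetches


if __name__ == "__main__":
    print(asyncio.run(demo_double_fetch()))  # prints 2
```

Both concurrent callers miss and fetch, so the counter ends at 2 rather than 1, while the cached value stays consistent. That matches the trade-off described above.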

401 retry: Both get_mcp_tools_customer and call_mcp_tool_customer invalidate the cached token and retry once on a 401 response, which handles server-side revocation that happens before the token expires.
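
The invalidate-and-retry-once pattern could look roughly like the sketch below. call_with_401_retry and its three callables are hypothetical and do not reflect the signatures in _customer.py; the fake server in demo exists only to exercise the path.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Tuple

Response = Tuple[int, str]  # (HTTP status, body), simplified for illustration


async def call_with_401_retry(
    get_token: Callable[[], Awaitable[str]],
    invalidate_token: Callable[[], None],
    send: Callable[[str], Awaitable[Response]],
) -> Response:
    """Send a request; on 401, drop the cached token and retry exactly once."""
    token = await get_token()
    response = await send(token)
    if response[0] != 401:
        return response
    invalidate_token()            # the cached token was rejected, so discard it
    token = await get_token()     # forces a fresh fetch
    return await send(token)      # a second 401 is returned to the caller as-is


async def demo() -> Tuple[int, str, int]:
    cache: Dict[str, str] = {"system": "revoked-token"}
    fetches = 0

    async def get_token() -> str:
        nonlocal fetches
        if "system" not in cache:
            fetches += 1
            cache["system"] = "fresh-token"
        return cache["system"]

    def invalidate() -> None:
        cache.pop("system", None)

    async def send(token: str) -> Response:
        # Fake MCP server: only the fresh token is accepted.
        return (200, "ok") if token == "fresh-token" else (401, "unauthorized")

    status, body = await call_with_401_retry(get_token, invalidate, send)
    return status, body, fetches
```

Limiting the retry to a single attempt avoids a loop when the credentials themselves are invalid: a second 401 propagates to the caller instead of triggering further token requests.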

vitalykumov requested a review from a team as a code owner on May 21, 2026, 07:50

cla-assistant Bot commented May 21, 2026

CLA assistant check
All committers have signed the CLA.
