@GabrielVGS

Summary

This PR adds support for passing a custom HTTP client instance to tiktoken's encoding functions, addressing the need for custom SSL certificates, proxies, and other HTTP client configurations.

Fixes #462

Changes

  • Added HttpClient Protocol in tiktoken/load.py to define the HTTP client interface
  • Updated read_file(), read_file_cached(), data_gym_to_mergeable_bpe_ranks(), and load_tiktoken_bpe() to accept an optional http_client parameter
  • Modified all encoding constructors in tiktoken_ext/openai_public.py to accept and pass through http_client
  • Updated get_encoding() in tiktoken/registry.py to accept http_client and pass it to constructors
  • Updated encoding_for_model() in tiktoken/model.py to accept http_client
  • Added test case test_custom_http_client() to verify the functionality
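
For reviewers who want a picture of the shape of the change, here is a minimal sketch of how a structural HttpClient interface and an http_client-aware read_file() could fit together. It is an illustration under assumptions, not the code in this PR: the HttpResponse protocol, the exact method signatures, and the fallback to requests.get are guesses, and the non-HTTP blob storage branch of the real loader is left out.

```python
from typing import Any, Optional, Protocol


class HttpResponse(Protocol):
    """Hypothetical response shape: only the members the loader needs."""

    @property
    def content(self) -> bytes: ...

    def raise_for_status(self) -> None: ...


class HttpClient(Protocol):
    """Hypothetical client interface; requests.Session matches it structurally."""

    def get(self, url: str, **kwargs: Any) -> HttpResponse: ...


def read_file(blobpath: str, http_client: Optional[HttpClient] = None) -> bytes:
    """Read a local path or an HTTP(S) URL, optionally through a custom client."""
    if "://" not in blobpath:
        # Plain filesystem path: no HTTP client involved.
        with open(blobpath, "rb") as f:
            return f.read()

    if http_client is None:
        # Assumed default: fall back to a plain requests.get call.
        import requests

        resp = requests.get(blobpath)
    else:
        resp = http_client.get(blobpath)

    resp.raise_for_status()
    return resp.content
```

Because the interface is a Protocol, any object with a compatible get() method can be passed, including a stub in tests, without inheriting from a tiktoken class.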

Motivation

In some environments, it's necessary to use a custom HTTP client, for example:

  • When connecting through a proxy
  • When using custom SSL certificates for verification
  • When custom headers or timeouts are required

Example Usage

import requests
import tiktoken

# Create a custom session with custom CA certificate
custom_session = requests.Session()
custom_session.verify = "/path/to/custom/cert.pem"

# Use with get_encoding
enc = tiktoken.get_encoding("gpt2", http_client=custom_session)

# Use with encoding_for_model
enc = tiktoken.encoding_for_model("gpt-4", http_client=custom_session)
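
The other cases listed under Motivation (proxy, custom headers, timeouts) can be handled on the same session object. The sketch below is illustrative only: the TimeoutSession class, the proxy URL, and the User-Agent string are made up for the example. A subclass is used for the timeout because requests.Session has no session-wide timeout setting.

```python
from typing import Any

import requests


class TimeoutSession(requests.Session):
    """Hypothetical helper: applies a default timeout when the caller gives none."""

    def __init__(self, timeout: float = 10.0) -> None:
        super().__init__()
        self.timeout = timeout

    def request(self, method: str, url: str, **kwargs: Any) -> requests.Response:
        # Session.get() and friends all route through request().
        kwargs.setdefault("timeout", self.timeout)
        return super().request(method, url, **kwargs)


session = TimeoutSession(timeout=5.0)

# Route HTTPS traffic through a proxy (placeholder address).
session.proxies.update({"https": "http://proxy.example.internal:8080"})

# Send a custom header with every request (placeholder value).
session.headers.update({"User-Agent": "my-tokenizer-fetcher/1.0"})

# Verify TLS against a custom CA bundle, as in the example above.
session.verify = "/path/to/custom/cert.pem"
```

With the signatures proposed in this PR, this session would be passed the same way as custom_session above, through the http_client argument.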

Testing

  • Added unit test in tests/test_simple_public.py::test_custom_http_client
  • The test verifies that a custom HTTP client works with both get_encoding() and encoding_for_model()
  • All existing tests pass

Breaking Changes

None. This change is backward compatible: every http_client parameter is optional and defaults to None, which preserves the current behavior.

Development

Successfully merging this pull request may close these issues.

Feature Request: Allow tiktoken to accept a custom HTTP client