Works with opencode, crush, claude code router, roo code, cline and anything that speaks the OpenAI API. Has tool calling and streaming support.
New — Minimal terminal UI with full mouse support — lightweight, low resource usage.
New — `coder-model` now points to Qwen 3.6 Plus (the Qwen team updated the alias). `qwen3.5-plus` requires a Coding Plan subscription and won't work with OAuth accounts.
Important: Users may hit 504 / timeout errors at 130k–150k+ token contexts — this is a Qwen upstream limit.
For a serverless/edge alternative: qwen-worker-proxy
npm install -g qwen-proxy
qwen-proxy
# then add accounts from the TUI (full mouse support)
Or headless (background/server mode):
qwen-proxy serve --headless
Point your client at http://localhost:8080/v1. The API key can be any string.
git clone https://github.com/aptdnfapt/qwen-code-oai-proxy
cd qwen-code-oai-proxy
cp .env.example .env
docker compose up -d
The container mounts ~/.qwen from your host — accounts you add are picked up live by the running container without a restart.
Add an account while the container is running:
docker compose exec qwen-proxy node dist/src/cli/qwen-proxy.js auth add myaccount
Point your client at http://localhost:8080/v1.
npm install
npm run auth:add myaccount
qwen-proxy serve
# or headless:
npm run serve:headless
qwen-proxy serve # TUI dashboard
qwen-proxy serve --headless # headless server
qwen-proxy auth list
qwen-proxy auth add <account-id>
qwen-proxy auth remove <account-id>
qwen-proxy auth counts
qwen-proxy usage
For fresh-machine regressions, use the built-in clean-home checks instead of ad-hoc shell probes:
npm run test:auth-clean-home
npm run test:first-run
npm run test:install-smoke
These scripts run the compiled code with a temporary HOME, so they simulate a new machine without touching your real ~/.qwen or local usage database.
More detail: docs/testing-clean-home.md
The easiest way to add accounts is from the TUI — just run qwen-proxy, go to the Accounts tab, and add from there.
You can also add via CLI:
qwen-proxy auth add account1
qwen-proxy auth add account2
qwen-proxy auth add account3
How rotation works:
- Requests rotate round-robin across all valid accounts
- Tokens refreshed ahead of expiry automatically
- Auth failures → one refresh attempt → rotate to next account
- Transient failures (429, 500, timeout) → rotate to next account, no cooldowns
- Client errors (bad payload etc.) → returned immediately, no rotation
- `DEFAULT_ACCOUNT` env var → that account is tried first
- Request counts reset daily at UTC midnight
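The round-robin rotation described above can be sketched in a few lines (a minimal TypeScript sketch with hypothetical names, not the proxy's actual implementation):

```typescript
// Sketch of the rotation policy: DEFAULT_ACCOUNT is tried first when it
// exists in the pool, then requests rotate round-robin across accounts.
class AccountRotator {
  private accounts: string[];
  private index = 0;

  constructor(accounts: string[], defaultAccount?: string) {
    this.accounts = accounts;
    // Prefer DEFAULT_ACCOUNT if it is a known account
    const i = defaultAccount ? accounts.indexOf(defaultAccount) : -1;
    if (i >= 0) this.index = i;
  }

  current(): string {
    return this.accounts[this.index];
  }

  // Called after a transient failure (429/500/timeout) or a failed
  // token refresh: move to the next account, wrapping around.
  rotate(): string {
    this.index = (this.index + 1) % this.accounts.length;
    return this.current();
  }
}
```

Client errors never trigger `rotate()`; they are returned to the caller unchanged, which matches the bullets above.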
For Docker:
docker compose exec qwen-proxy node dist/src/cli/qwen-proxy.js auth list
docker compose exec qwen-proxy node dist/src/cli/qwen-proxy.js auth add <account-id>
docker compose exec qwen-proxy node dist/src/cli/qwen-proxy.js auth remove <account-id>
| Model ID | Description | Max Tokens | Notes |
|---|---|---|---|
| `coder-model` | Recommended — Qwen 3.6 Plus (alias, auto-updated by Qwen) | 65536 | Default, best for coding |
| `qwen3.5-plus` | Alias → resolves to `coder-model` | 65536 | Kept for backward compatibility |
| `qwen3.6-plus` | Alias → resolves to `coder-model` | 65536 | |
| `qwen3-coder-plus` | Qwen 3 Coder Plus | 65536 | |
| `qwen3-coder-flash` | Qwen 3 Coder Flash | 65536 | Faster, lighter |
| `vision-model` | Multimodal with image support | 32768 | Lower token limit, auto-clamped |
Note: `coder-model` is an alias maintained by Qwen. It was Qwen 3.5 Plus and now points to Qwen 3.6 Plus. Both `qwen3.5-plus` and `qwen3.6-plus` resolve to `coder-model`.
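The table notes that `vision-model`'s lower token limit is auto-clamped; the idea can be sketched like this (hypothetical helper, limits taken from the model list above, not the proxy's actual code):

```typescript
// Per-model output limits from the model table (illustrative subset).
const MODEL_MAX_TOKENS: Record<string, number> = {
  'coder-model': 65536,
  'vision-model': 32768,
};

// Cap a requested max_tokens at the model's limit so an oversized
// request to vision-model does not fail upstream.
function clampMaxTokens(model: string, requested: number): number {
  const limit = MODEL_MAX_TOKENS[model] ?? 65536;
  return Math.min(requested, limit);
}
```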
- `POST /v1/chat/completions` — Chat completions (streaming + non-streaming)
- `GET /v1/models` — List available models
- `POST /v1/web/search` — Web search (2000 req/day free)
- `GET/POST /mcp` — MCP server (SSE transport)
- `GET /health` — Health check
import OpenAI from 'openai';
const openai = new OpenAI({
apiKey: 'fake-key',
baseURL: 'http://localhost:8080/v1'
});
const response = await openai.chat.completions.create({
model: 'coder-model',
messages: [{ role: 'user', content: 'Hello!' }]
});
console.log(response.choices[0].message.content);
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer fake-key" \
-d '{
"model": "coder-model",
"messages": [{"role": "user", "content": "Hello!"}],
"temperature": 0.7,
"max_tokens": 200,
"reasoning": {"effort": "high"}
}'
`effort` can be `"high"`, `"medium"`, `"low"`, or `"none"` (disables thinking).
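Attaching the effort level to a request body is simple enough to factor out (a sketch; `buildChatBody` is an illustrative name, the payload shape follows the curl example above):

```typescript
type Effort = 'high' | 'medium' | 'low' | 'none';

// Build an OpenAI-style chat-completions body, adding the optional
// reasoning effort field only when one is given.
function buildChatBody(model: string, prompt: string, effort?: Effort) {
  return {
    model,
    messages: [{ role: 'user', content: prompt }],
    ...(effort ? { reasoning: { effort } } : {}),
  };
}
```

A body built this way can be POSTed to `/v1/chat/completions` exactly like the curl example.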
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer fake-key" \
-d '{
"model": "coder-model",
"messages": [{"role": "user", "content": "Explain how to reverse a string in JavaScript."}],
"stream": true,
"max_tokens": 300,
"reasoning": {"effort": "medium"}
}'
Free web search — 1000 requests/account/day:
curl -X POST http://localhost:8080/v1/web/search \
-H "Content-Type: application/json" \
-H "Authorization: Bearer fake-key" \
-d '{"query": "latest AI developments", "page": 1, "rows": 5}'
`effort` can be `"high"`, `"medium"`, `"low"`, or `"none"` (disables thinking); in opencode you can switch it with the ctrl+t keybind.
Add to ~/.config/opencode/opencode.json:
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"qwen": {
"npm": "@ai-sdk/openai-compatible",
"name": "proxy",
"options": {
"baseURL": "http://localhost:8080/v1"
},
"models": {
"coder-model": {
        "name": "qwen3.6-plus",
"reasoning": true,
"modalities": {
"input": [
"text",
"image"
],
"output": [
"text"
]
},
"attachment": true,
"limit": {
"context": 195000,
"output": 60000
}
}
}
}
}
}
Add to ~/.config/crush/crush.json:
{
"$schema": "https://charm.land/crush.json",
"providers": {
"proxy": {
"type": "openai",
"base_url": "http://localhost:8080/v1",
"api_key": "",
"models": [
{
"id": "coder-model",
"name": "coder-model",
"cost_per_1m_in": 0.0,
"cost_per_1m_out": 0.0,
"cost_per_1m_in_cached": 0,
"cost_per_1m_out_cached": 0,
"context_window": 150000,
"default_max_tokens": 32768
}
]
}
}
}
For claude code router:
{
"LOG": false,
"Providers": [
{
"name": "qwen-code",
"api_base_url": "http://localhost:8080/v1/chat/completions/",
"api_key": "any-string",
"models": ["coder-model"],
"transformer": {
"use": [
["maxtoken", {"max_tokens": 32768}],
"enhancetool",
"cleancache"
]
}
}
],
"Router": {
"default": "qwen-code,coder-model"
}
}
For Roo Code / Cline / Kilo Code:
- Go to settings → choose OpenAI Compatible
- Set URL: http://localhost:8080/v1
- API key: any random string
- Model: coder-model
- Disable the streaming checkbox (Roo Code / Kilo Code)
- Max output: 32000
- Context window: up to 300k (but past 150k it gets slower)
Add to ~/.config/opencode/config.json:
{
"$schema": "https://opencode.ai/config.json",
"mcp": {
"qwen-web-search": {
"type": "remote",
"url": "http://localhost:8080/mcp",
"headers": {
"Authorization": "Bearer your-api-key"
}
}
}
}
Omit the headers if you have no API key set. Works with other MCP clients too.
# Single key
API_KEY=your-secret-key
# Multiple keys
API_KEY=key1,key2,key3
Supported headers:
Authorization: Bearer your-secret-key
X-API-Key: your-secret-key
If no API key is configured, no auth is required.
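The key check described above fits in one small function (a sketch with hypothetical names, not the proxy's actual code):

```typescript
// Accept either supported header, compare against the comma-separated
// API_KEY list, and require nothing when no keys are configured.
function isAuthorized(
  apiKeyEnv: string | undefined,
  headers: Record<string, string | undefined>,
): boolean {
  if (!apiKeyEnv) return true; // no key configured, no auth required
  const keys = apiKeyEnv.split(',').map((k) => k.trim());
  const bearer = headers['authorization']?.replace(/^Bearer\s+/i, '');
  const xKey = headers['x-api-key'];
  return keys.includes(bearer ?? '') || keys.includes(xKey ?? '');
}
```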
Set via environment variables or .env file:
| Variable | Default | Description |
|---|---|---|
| `PORT` | `8080` | Server port |
| `HOST` | `localhost` | Bind address (`0.0.0.0` for Docker) |
| `API_KEY` | — | Comma-separated auth keys |
| `DEFAULT_ACCOUNT` | — | Account to prefer first |
| `LOG_LEVEL` | `error-debug` | `off` / `error` / `error-debug` / `debug` |
| `MAX_DEBUG_LOGS` | `20` | Max request debug dirs to keep |
| `QWEN_PROXY_HOME` | `~/.local/share/qwen-proxy` | Override runtime data dir |
| `QWEN_PROXY_LOG_DIR` | — | Override log dir |
Compatibility aliases: DEBUG_LOG=true → LOG_LEVEL=debug, LOG_FILE_LIMIT → MAX_DEBUG_LOGS
Example .env:
LOG_LEVEL=debug
MAX_DEBUG_LOGS=10
API_KEY=your-secret-key
DEFAULT_ACCOUNT=my-primary-account
Port and host can also be changed from the TUI Settings screen and are saved to config.json automatically.
| Path | Contents |
|---|---|
| `~/.qwen/oauth_creds_<id>.json` | Account credentials |
| `~/.local/share/qwen-proxy/usage.db` | Request + token usage (SQLite) |
| `~/.local/share/qwen-proxy/config.json` | Port, host, log level, auto-start |
| `~/.local/share/qwen-proxy/log/` | Error logs |
curl http://localhost:8080/health
Returns server status, account validation, token expiry info, and request counts.
qwen-proxy usage
# or
npm run usage
npm run tokens
Shows daily token usage, cache hits, and request counts per account. Also visible on the TUI Usage screen.
Change live without restart:
# inspect
GET /runtime/log-level
# change
POST /runtime/log-level
{"level": "debug"}
# change without persisting
POST /runtime/log-level
{"level": "error", "persist": false}
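The `persist` flag's semantics can be sketched as a pure state update (hypothetical state shape, illustrative only):

```typescript
// Runtime vs. persisted log level: with persist=false the running level
// changes but the value that would be written to config.json does not.
interface LogState {
  level: string;     // what the server logs at right now
  persisted: string; // what config.json would hold
}

function setLogLevel(state: LogState, level: string, persist = true): LogState {
  return { level, persisted: persist ? level : state.persisted };
}
```

So a non-persisted change survives only until the next restart, when the server reloads the persisted level.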