Commit 5a01f8d

Author: Mateusz

feat: add Kimi Code connector and backend startup disablement
- Add Kimi Code connector implementation with OpenAI-compatible API
- Add backend startup disablement service for selective backend loading
- Update Gemini base connector and model validation
- Update backend configuration provider with environment-based config
- Update application builder to support backend disablement
- Update models controller for modality support
- Add comprehensive unit tests for new components
- Add Kimi Code backend documentation
- Update .gitignore to exclude var/cache/ directory
1 parent: 9106c6e

25 files changed: +1150 -179 lines

.gitignore

Lines changed: 4 additions & 1 deletion

@@ -125,4 +125,7 @@ var/gemini_oauth_accounts/
 var/kiro_oauth_accounts/

 # LevelDB test data (may contain sensitive data)
-leveldb_test/
+leveldb_test/
+
+# Cache directory
+var/cache/
docs/user_guide/backends/kimi-code.md

Lines changed: 80 additions & 0 deletions
# Kimi Code Backend

The `kimi-code` backend provides an OpenAI-compatible connector for Kimi's coding gateway.

It is implemented as a subclass of the OpenAI-compatible backend connector and targets:

- Base URL: `https://api.kimi.com/coding/v1`

This backend is intended to be used via the [OpenAI Chat Completions frontend](../frontends/openai-chat-completions.md).
## Configuration

### Environment Variables

Set the API key:

```bash
export KIMI_API_KEY="..."
```
## Model Naming

The proxy exposes a single model through this backend:

- `kimi-for-coding`

When calling the OpenAI Chat Completions frontend, use the fully-qualified model string:

- `kimi-code:kimi/kimi-for-coding`

Example:

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-code:kimi/kimi-for-coding",
    "messages": [{"role": "user", "content": "Write a Python function that prints Hello World."}],
    "stream": true
  }'
```
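The `backend:model` convention above can be illustrated with a small parser. This is a hypothetical helper for illustration only, not the proxy's actual routing code; it splits on the first `:` because the model part may itself contain `/`:

```python
def split_model(qualified: str) -> tuple[str, str]:
    """Split a fully-qualified model string like
    'kimi-code:kimi/kimi-for-coding' into (backend, model).

    Only the first ':' is used as the separator, since the
    model portion may contain '/' or other characters.
    """
    backend, _, model = qualified.partition(":")
    return backend, model


print(split_model("kimi-code:kimi/kimi-for-coding"))
# -> ('kimi-code', 'kimi/kimi-for-coding')
```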
## Multimodal (Text + Image)

This backend advertises the model as accepting:

- Input modalities: `text`, `image`
- Output modalities: `text`

Example (OpenAI-compatible `image_url` message parts):

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-code:kimi/kimi-for-coding",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe what you see and suggest refactors."},
        {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}}
      ]
    }],
    "stream": true
  }'
```
## Reasoning Output Compatibility

Some OpenAI-compatible providers stream text using `reasoning_content` while leaving `content` empty. Many clients only render `content`.

The `kimi-code` connector mirrors reasoning text into `content` when needed, while keeping the original reasoning fields intact. This makes the backend usable with clients that do not understand `reasoning_content`.
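The mirroring idea can be sketched as follows. This is a minimal illustration under an assumed delta shape (a plain dict with `content` and `reasoning_content` keys), not the connector's actual implementation:

```python
def mirror_reasoning(delta: dict) -> dict:
    """If a streamed delta carries text only in `reasoning_content`,
    copy that text into `content` so clients that render only
    `content` still see it. The original `reasoning_content`
    field is left intact."""
    reasoning = delta.get("reasoning_content")
    if reasoning and not delta.get("content"):
        return {**delta, "content": reasoning}
    return delta


# A delta that only carries reasoning text gets it mirrored:
print(mirror_reasoning({"reasoning_content": "First, define...", "content": ""}))

# A delta that already has content passes through unchanged:
print(mirror_reasoning({"content": "Hello"}))
```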
## Related Documentation

- [Backend Overview](overview.md)
- [OpenAI Chat Completions Frontend](../frontends/openai-chat-completions.md)

docs/user_guide/backends/overview.md

Lines changed: 18 additions & 15 deletions
@@ -17,15 +17,16 @@ The proxy supports the following backend providers out of the box:
 | `gemini-oauth-plan` | Google Gemini (CLI) | OAuth | Users with Google One subscription |
 | `gemini-oauth-free` | Google Gemini (CLI) | OAuth | Free tier users |
 | `gemini-cli-cloud-project` | Google Gemini (GCP) | OAuth + GCP Project | Enterprise, team workflows, central billing |
-| `openrouter` | OpenRouter | API Key | Access to many hosted models |
-| `zenmux` | ZenMux | API Key | OpenAI-compatible ZenMux router |
-| `zai` | ZAI | API Key | Zhipu/Z.ai access |
-| `zai-coding-plan` | ZAI Coding Plan | API Key | Coding-specific workflows |
-| `minimax` | Minimax | API Key | Minimax AI models |
-| `qwen-oauth` | Alibaba Qwen | Local OAuth token | Qwen CLI OAuth |
-| `qwen-oauth` | Alibaba Qwen | Local OAuth token | Qwen CLI OAuth |
-| `hybrid` | Virtual (orchestrates two models) | Inherits from sub-backends | Two-phase reasoning + execution |
-| `antigravity-oauth` | Google Gemini (Antigravity) | Antigravity Token | Internal debugging (Gemini models) |
+| `openrouter` | OpenRouter | API Key | Access to many hosted models |
+| `zenmux` | ZenMux | API Key | OpenAI-compatible ZenMux router |
+| `zai` | ZAI | API Key | Zhipu/Z.ai access |
+| `zai-coding-plan` | ZAI Coding Plan | API Key | Coding-specific workflows |
+| `kimi-code` | Kimi | API Key | Kimi For Coding (OpenAI-compatible) |
+| `minimax` | Minimax | API Key | Minimax AI models |
+| `qwen-oauth` | Alibaba Qwen | Local OAuth token | Qwen CLI OAuth |
+| `qwen-oauth` | Alibaba Qwen | Local OAuth token | Qwen CLI OAuth |
+| `hybrid` | Virtual (orchestrates two models) | Inherits from sub-backends | Two-phase reasoning + execution |
+| `antigravity-oauth` | Google Gemini (Antigravity) | Antigravity Token | Internal debugging (Gemini models) |

 ## Frontend APIs

@@ -59,10 +60,11 @@ Backends are configured through environment variables and the proxy configuration
 export OPENAI_API_KEY="sk-..."
 export ANTHROPIC_API_KEY="sk-ant-..."
 export GEMINI_API_KEY="AIza..."
-export OPENROUTER_API_KEY="sk-or-..."
-export ZENMUX_API_KEY="..."
-export ZAI_API_KEY="..."
-export MINIMAX_API_KEY="..."
+export OPENROUTER_API_KEY="sk-or-..."
+export ZENMUX_API_KEY="..."
+export ZAI_API_KEY="..."
+export KIMI_API_KEY="..."
+export MINIMAX_API_KEY="..."

 # For GCP-based Gemini
 export GOOGLE_CLOUD_PROJECT="your-project-id"
@@ -120,8 +122,9 @@ For detailed configuration and usage information for each backend, see:
 - [ZAI Backend](zai.md)
 - [Qwen Backend](qwen.md)
 - [MiniMax Backend](minimax.md)
-- [ZenMux Backend](zenmux.md)
-- [Custom Backends](custom-backends.md)
+- [ZenMux Backend](zenmux.md)
+- [Kimi Code Backend](kimi-code.md)
+- [Custom Backends](custom-backends.md)

 ## Related Features

docs/user_guide/index.md

Lines changed: 3 additions & 1 deletion
@@ -110,13 +110,15 @@ Backend provider configuration and usage:
 - **[Antigravity OAuth Backend](backends/antigravity-oauth.md)** - Internal Antigravity OAuth configuration
 - **[Kiro OAuth Auto Backend](backends/kiro-oauth-auto.md)** - Amazon Kiro / Q Developer streaming via self-managed OAuth

+- **[Kimi Code Backend](backends/kimi-code.md)** - Kimi For Coding via OpenAI-compatible API
+
 - **[OpenRouter Backend](backends/openrouter.md)** - OpenRouter multi-model access
 - **[ZAI Backend](backends/zai.md)** - Zhipu/Z.ai configuration
 - **[Qwen Backend](backends/qwen.md)** - Alibaba Qwen OAuth configuration
 - **[Minimax Backend](backends/minimax.md)** - Minimax API configuration
 - **[Zenmux Backend](backends/zenmux.md)** - Zenmux API configuration
 - **[OpenCode Zen Backend](backends/opencode-zen.md)** - OpenCode Zen API configuration
-- **[Custom Backends](backends/custom-backends.md)** - Creating and configuring custom backend connectors
+- **[Custom Backends](backends/custom-backends.md)** - Creating and configuring custom backend connectors

 ## Debugging

src/connectors/gemini_base/config.py

Lines changed: 4 additions & 13 deletions
@@ -29,22 +29,13 @@

 # Default available models for fallback
 DEFAULT_AVAILABLE_MODELS = [
-    # Current generation (2.5 series) - DEFAULT models
-    "gemini-2.5-pro",
+    # Current generation (3.x series)
+    "gemini-3-pro-preview",
     "gemini-3-flash-preview",
+    # 2.5 series
+    "gemini-2.5-pro",
     "gemini-2.5-flash",
     "gemini-2.5-flash-lite",
-    # Preview models
-    "gemini-2.5-pro-preview-05-06",
-    "gemini-2.5-pro-preview-06-05",
-    "gemini-2.5-flash-preview-05-20",
-    # 2.0 series
-    "gemini-2.0-flash",
-    "gemini-2.0-flash-thinking-exp-1219",
-    "gemini-2.0-flash-preview-image-generation",
-    # 1.5 series
-    "gemini-1.5-pro",
-    "gemini-1.5-flash",
     # Embedding model
     "gemini-embedding-001",
 ]

src/connectors/gemini_base/connector.py

Lines changed: 4 additions & 12 deletions
@@ -1166,21 +1166,13 @@ async def _ensure_models_loaded(self) -> None:
         if not self.available_models:
             # Use a hardcoded list based on gemini-cli's tokenLimits.ts and models.ts
             self.available_models = [
-                # Current generation (2.5 series) - DEFAULT models
+                # Current generation (3.x series)
+                "gemini-3-pro-preview",
+                "gemini-3-flash-preview",
+                # 2.5 series
                 "gemini-2.5-pro",
                 "gemini-2.5-flash",
                 "gemini-2.5-flash-lite",
-                # Preview models
-                "gemini-2.5-pro-preview-05-06",
-                "gemini-2.5-pro-preview-06-05",
-                "gemini-2.5-flash-preview-05-20",
-                # 2.0 series
-                "gemini-2.0-flash",
-                "gemini-2.0-flash-thinking-exp-1219",
-                "gemini-2.0-flash-preview-image-generation",
-                # 1.5 series
-                "gemini-1.5-pro",
-                "gemini-1.5-flash",
                 # Embedding model
                 "gemini-embedding-001",
             ]

src/connectors/gemini_base/model_validation.py

Lines changed: 4 additions & 12 deletions
@@ -195,21 +195,13 @@ class ModelListManager:

     # Default fallback model list
     DEFAULT_MODELS: list[str] = [
-        # Current generation (2.5 series) - DEFAULT models
+        # Current generation (3.x series)
+        "gemini-3-pro-preview",
+        "gemini-3-flash-preview",
+        # 2.5 series
         "gemini-2.5-pro",
         "gemini-2.5-flash",
         "gemini-2.5-flash-lite",
-        # Preview models
-        "gemini-2.5-pro-preview-05-06",
-        "gemini-2.5-pro-preview-06-05",
-        "gemini-2.5-flash-preview-05-20",
-        # 2.0 series
-        "gemini-2.0-flash",
-        "gemini-2.0-flash-thinking-exp-1219",
-        "gemini-2.0-flash-preview-image-generation",
-        # 1.5 series
-        "gemini-1.5-pro",
-        "gemini-1.5-flash",
         # Embedding model
         "gemini-embedding-001",
     ]
