Compare commits

..

9 Commits

Author SHA1 Message Date
snaily
cc36ba4c9e feat(config): 新增流式输出优化器开关配置
在环境变量示例文件(.env.example)和配置类(config.py)中新增 STREAM_OPTIMIZER_ENABLED 配置项,用于控制流式输出优化器的启用状态,默认设为 false

调整 Gemini 和 OpenAI 聊天服务的流式响应处理逻辑:
- 仅在流式优化器启用时(settings.STREAM_OPTIMIZER_ENABLED 为 true)
- 才会对文本内容执行流式输出优化处理
- 保持原有文本提取逻辑不变,仅增加配置条件判断

该变更使流式输出优化器变为可选功能,方便根据实际需求进行开关控制
2025-04-03 04:47:06 +08:00
snaily
baf643e884 feat: 新增请求超时配置及优化模型列表接口api_key获取方式
1. 新增功能:
   - 在`.env.example`中添加`TIME_OUT=300`配置项(包含中文注释)
   - 在`Settings`类中增加`TIME_OUT`字段(读取自`DEFAULT_TIMEOUT`)

2. 优化内容:
   - 生成配置:
     * 为`GenerationConfig`设置默认温度/TOP_P/TOP_K值
     * 移除`maxOutputTokens`默认值,改为可选传递
   - OpenAI请求:
     * 移除`max_tokens`默认值
     * 只有当`max_tokens`有值时才添加到请求payload
   - 日志优化:
     * 注释掉`stream_optimizer.py`中部分调试日志

3. 模型列表接口api_key获取方式
2025-04-03 03:12:59 +08:00
严浩
360bc9e48d feat(ci): 更新Docker发布工作流 2025-04-02 13:49:05 +08:00
snaily
c0a27d0542 Update README.md 2025-03-29 01:03:36 +08:00
snaily
84052a2179 feat(auth): 增强Gemini API的认证机制支持URL参数
- 将generate_content和stream_generate_content端点的认证依赖从verify_goog_api_key更改为verify_key_or_goog_api_key
- 使Gemini API同时支持URL参数中的key和请求头中的x-goog-api-key进行认证
- 提高API的灵活性,便于不同客户端集成
2025-03-28 23:44:40 +08:00
snaily
2e7ecd88b5 feat: 增强Gemini API tools参数处理
- 修改GeminiRequest模型,使tools字段支持单个工具对象或工具对象列表
- 在gemini_chat_service中添加类型转换逻辑,确保tools始终以列表形式处理
- 提高API的灵活性和兼容性
2025-03-28 20:50:01 +08:00
snaily
0b1f3dfc04 feat(auth): 支持x-goog-api-key请求头认证
- 添加verify_key_or_goog_api_key方法,支持同时验证URL参数中的key和请求头中的x-goog-api-key
- 更新models接口使用新的认证方法,提高与Google API客户端的兼容性
2025-03-28 19:27:42 +08:00
snaily
c691c7c1cf fix:当没有可用工具时返回空列表而非包含空字典的列表
在_build_tools函数中,当没有工具配置可用时(即tool为空字典),现在会返回空列表[]而不是[{}]。这个防御性编程修复可以避免向Gemini API发送无效的工具配置,防止可能的API调用错误。
2025-03-25 15:18:27 +08:00
snaily
97db7eebf1 chore:修改图片处理逻辑,统一使用base64编码
将_convert_image函数中对非data:image格式URL的处理方式从直接返回URL改为转换为base64编码的内联数据。这样无论图片是以data URI形式还是URL形式提供,都会统一转换为base64编码,确保与API交互时图片数据格式的一致性。
2025-03-25 13:23:17 +08:00
14 changed files with 87 additions and 36 deletions

View File

@@ -10,6 +10,8 @@ SHOW_SEARCH_LINK=true
SHOW_THINKING_PROCESS=true
BASE_URL=https://generativelanguage.googleapis.com/v1beta
MAX_FAILURES=10
# 请求超时时间(秒)
TIME_OUT=300
#########################image_generate 相关配置###########################
PAID_KEY=AIzaSyxxxxxxxxxxxxxxxxxxx
CREATE_IMAGE_MODEL=imagen-3.0-generate-002
@@ -20,6 +22,7 @@ CLOUDFLARE_IMGBED_URL=https://xxxxxxx.pages.dev/upload
CLOUDFLARE_IMGBED_AUTH_CODE=xxxxxxxxx
##########################################################################
#########################stream_optimizer 相关配置########################
STREAM_OPTIMIZER_ENABLED=false
STREAM_MIN_DELAY=0.016
STREAM_MAX_DELAY=0.024
STREAM_SHORT_TEXT_THRESHOLD=10

View File

@@ -2,8 +2,6 @@ name: Docker Image CI
on:
push:
# branches: [ "main" ]
tags: [ 'v*.*.*' ]
pull_request:
branches: [ "main" ]
@@ -43,20 +41,30 @@ jobs:
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=raw,value=latest,enable={{is_default_branch}}
# https://github.com/docker/metadata-action/tree/v5/?tab=readme-ov-file#semver
# Event: push, Ref: refs/head/main, Tags: main
# Event: push tag, Ref: refs/tags/v1.2.3, Tags: 1.2.3, 1.2, 1, latest
# Event: push tag, Ref: refs/tags/v2.0.8-rc1, Tags: 2.0.8-rc1
type=ref,event=branch
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=sha,format=long
type=semver,pattern={{major}}
labels: |
org.opencontainers.image.description=OpenAI API Compatible Server
org.opencontainers.image.source=${{ github.event.repository.html_url }}
- name: Build and push Docker image
uses: docker/build-push-action@v5
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Build and push
uses: docker/build-push-action@v6
with:
file: Dockerfile
context: .
platforms: linux/amd64,linux/arm64
push: ${{ github.event_name != 'pull_request' }}
load: false
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max
cache-from: type=gha,scope=${{ github.workflow }}
cache-to: type=gha,scope=${{ github.workflow }}

View File

@@ -1,4 +1,4 @@
# 🚀 FastAPI OpenAI (Gemini) 代理服务
# 🚀 Gemini 代理服务支持openai/gemini格式
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

View File

@@ -4,7 +4,7 @@
from typing import List
from pydantic_settings import BaseSettings
from app.core.constants import API_VERSION, DEFAULT_CREATE_IMAGE_MODEL, DEFAULT_FILTER_MODELS, DEFAULT_MODEL, DEFAULT_STREAM_CHUNK_SIZE, DEFAULT_STREAM_LONG_TEXT_THRESHOLD, DEFAULT_STREAM_MAX_DELAY, DEFAULT_STREAM_MIN_DELAY, DEFAULT_STREAM_SHORT_TEXT_THRESHOLD
from app.core.constants import API_VERSION, DEFAULT_CREATE_IMAGE_MODEL, DEFAULT_FILTER_MODELS, DEFAULT_MODEL, DEFAULT_STREAM_CHUNK_SIZE, DEFAULT_STREAM_LONG_TEXT_THRESHOLD, DEFAULT_STREAM_MAX_DELAY, DEFAULT_STREAM_MIN_DELAY, DEFAULT_STREAM_SHORT_TEXT_THRESHOLD, DEFAULT_TIMEOUT
class Settings(BaseSettings):
@@ -16,6 +16,7 @@ class Settings(BaseSettings):
AUTH_TOKEN: str = ""
MAX_FAILURES: int = 3
TEST_MODEL: str = DEFAULT_MODEL
TIME_OUT: int = DEFAULT_TIMEOUT
# 模型相关配置
SEARCH_MODELS: List[str] = ["gemini-2.0-flash-exp"]
@@ -35,6 +36,7 @@ class Settings(BaseSettings):
CLOUDFLARE_IMGBED_AUTH_CODE: str = ""
# 流式输出优化器配置
STREAM_OPTIMIZER_ENABLED: bool = False
STREAM_MIN_DELAY: float = DEFAULT_STREAM_MIN_DELAY
STREAM_MAX_DELAY: float = DEFAULT_STREAM_MAX_DELAY
STREAM_SHORT_TEXT_THRESHOLD: int = DEFAULT_STREAM_SHORT_TEXT_THRESHOLD

View File

@@ -72,3 +72,22 @@ class SecurityService:
raise HTTPException(status_code=401, detail="Invalid auth_token")
return token
async def verify_key_or_goog_api_key(
self, key: Optional[str] = None , x_goog_api_key: Optional[str] = Header(None)
) -> str:
"""验证URL中的key或请求头中的x-goog-api-key"""
# 如果URL中的key有效直接返回
if key in self.allowed_tokens or key == self.auth_token:
return key
# 否则检查请求头中的x-goog-api-key
if not x_goog_api_key:
logger.error("Invalid key and missing x-goog-api-key header")
raise HTTPException(status_code=401, detail="Invalid key and missing x-goog-api-key header")
if x_goog_api_key not in self.allowed_tokens and x_goog_api_key != self.auth_token:
logger.error("Invalid key and invalid x-goog-api-key")
raise HTTPException(status_code=401, detail="Invalid key and invalid x-goog-api-key")
return x_goog_api_key

View File

@@ -1,6 +1,8 @@
from typing import List, Optional, Dict, Any, Literal
from typing import List, Optional, Dict, Any, Literal, Union
from pydantic import BaseModel
from app.core.constants import DEFAULT_TEMPERATURE, DEFAULT_TOP_K, DEFAULT_TOP_P
class SafetySetting(BaseModel):
category: Optional[Literal["HARM_CATEGORY_HATE_SPEECH", "HARM_CATEGORY_DANGEROUS_CONTENT", "HARM_CATEGORY_HARASSMENT", "HARM_CATEGORY_SEXUALLY_EXPLICIT", "HARM_CATEGORY_CIVIC_INTEGRITY"]] = None
@@ -13,9 +15,9 @@ class GenerationConfig(BaseModel):
responseSchema: Optional[Dict[str, Any]] = None
candidateCount: Optional[int] = 1
maxOutputTokens: Optional[int] = None
temperature: Optional[float] = None
topP: Optional[float] = None
topK: Optional[int] = None
temperature: Optional[float] = DEFAULT_TEMPERATURE
topP: Optional[float] = DEFAULT_TOP_P
topK: Optional[int] = DEFAULT_TOP_K
presencePenalty: Optional[float] = None
frequencyPenalty: Optional[float] = None
responseLogprobs: Optional[bool] = None
@@ -34,7 +36,7 @@ class GeminiContent(BaseModel):
class GeminiRequest(BaseModel):
contents: List[GeminiContent] = []
tools: Optional[List[Dict[str, Any]]] = []
tools: Optional[Union[List[Dict[str, Any]], Dict[str, Any]]] = []
safetySettings: Optional[List[SafetySetting]] = None
generationConfig: Optional[GenerationConfig] = None
systemInstruction: Optional[SystemInstruction] = None

View File

@@ -1,7 +1,7 @@
from pydantic import BaseModel
from typing import List, Optional, Union
from app.core.constants import DEFAULT_MAX_TOKENS, DEFAULT_MODEL, DEFAULT_TEMPERATURE, DEFAULT_TOP_K, DEFAULT_TOP_P
from app.core.constants import DEFAULT_MODEL, DEFAULT_TEMPERATURE, DEFAULT_TOP_K, DEFAULT_TOP_P
class ChatRequest(BaseModel):
@@ -10,7 +10,7 @@ class ChatRequest(BaseModel):
temperature: Optional[float] = DEFAULT_TEMPERATURE
stream: Optional[bool] = False
tools: Optional[List[dict]] = []
max_tokens: Optional[int] = DEFAULT_MAX_TOKENS
max_tokens: Optional[int] = None
top_p: Optional[float] = DEFAULT_TOP_P
top_k: Optional[int] = DEFAULT_TOP_K
stop: Optional[List[str]] = []

View File

@@ -49,11 +49,14 @@ def _convert_image(image_url: str) -> Dict[str, Any]:
"data": encoded_data
}
}
return {
"image_url": {
"url": image_url
else:
encoded_data = _convert_image_to_base64(image_url)
return {
"inline_data": {
"mime_type": "image/png",
"data": encoded_data
}
}
}
def _convert_image_to_base64(url: str) -> str:

View File

@@ -107,15 +107,15 @@ class StreamOptimizer:
# 计算智能延迟时间
delay = self.calculate_delay(len(text))
if self.logger:
self.logger.info(f"Text length: {len(text)}, delay: {delay:.4f}s")
# if self.logger:
# self.logger.info(f"Text length: {len(text)}, delay: {delay:.4f}s")
# 根据文本长度决定输出方式
if len(text) >= self.long_text_threshold:
# 长文本:分块输出
chunks = self.split_text_into_chunks(text)
if self.logger:
self.logger.info(f"Long text: splitting into {len(chunks)} chunks")
# if self.logger:
# self.logger.info(f"Long text: splitting into {len(chunks)} chunks")
for chunk_text in chunks:
chunk_response = create_response_chunk(chunk_text)
yield format_chunk(chunk_response)

View File

@@ -34,14 +34,14 @@ async def get_next_working_key(key_manager: KeyManager = Depends(get_key_manager
@router.get("/models")
@router_v1beta.get("/models")
async def list_models(
_=Depends(security_service.verify_key),
_=Depends(security_service.verify_key_or_goog_api_key),
key_manager: KeyManager = Depends(get_key_manager)
):
"""获取可用的Gemini模型列表"""
logger.info("-" * 50 + "list_gemini_models" + "-" * 50)
logger.info("Handling Gemini models list request")
api_key = await key_manager.get_next_working_key()
api_key = await key_manager.get_first_valid_key()
logger.info(f"Using API key: {api_key}")
models_json = model_service.get_gemini_models(api_key)
@@ -86,7 +86,7 @@ async def list_models(
async def generate_content(
model_name: str,
request: GeminiRequest,
_=Depends(security_service.verify_goog_api_key),
_=Depends(security_service.verify_key_or_goog_api_key),
api_key: str = Depends(get_next_working_key),
key_manager: KeyManager = Depends(get_key_manager)
):
@@ -118,7 +118,7 @@ async def generate_content(
async def stream_generate_content(
model_name: str,
request: GeminiRequest,
_=Depends(security_service.verify_goog_api_key),
_=Depends(security_service.verify_key_or_goog_api_key),
api_key: str = Depends(get_next_working_key),
key_manager: KeyManager = Depends(get_key_manager)
):

View File

@@ -44,7 +44,7 @@ async def list_models(
):
logger.info("-" * 50 + "list_models" + "-" * 50)
logger.info("Handling models list request")
api_key = await key_manager.get_next_working_key()
api_key = await key_manager.get_first_valid_key()
logger.info(f"Using API key: {api_key}")
try:
return model_service.get_gemini_openai_models(api_key)

View File

@@ -44,6 +44,8 @@ def _build_tools(model: str, payload: Dict[str, Any]) -> List[Dict[str, Any]]:
tool = dict()
if payload and isinstance(payload, dict) and "tools" in payload:
if payload.get("tools") and isinstance(payload.get("tools"), dict):
payload["tools"] = [payload.get("tools")]
items = payload.get("tools", [])
if items and isinstance(items, list):
tool.update(_merge_tools(items))
@@ -62,7 +64,7 @@ def _build_tools(model: str, payload: Dict[str, Any]) -> List[Dict[str, Any]]:
tool.pop("googleSearch", None)
tool.pop("codeExecution", None)
return [tool]
return [tool] if tool else []
def _get_safety_settings(model: str) -> List[Dict[str, str]]:
@@ -87,6 +89,11 @@ def _get_safety_settings(model: str) -> List[Dict[str, str]]:
def _build_payload(model: str, request: GeminiRequest) -> Dict[str, Any]:
"""构建请求payload"""
request_dict = request.model_dump()
if request.generationConfig:
if request.generationConfig.maxOutputTokens is None:
# 如果未指定最大输出长度,则不传递该字段,解决截断的问题
request_dict["generationConfig"].pop("maxOutputTokens")
payload = {
"contents": request_dict.get("contents", []),
"tools": _build_tools(model, request_dict),
@@ -160,9 +167,8 @@ class GeminiChatService:
json.loads(line), model, stream=True
)
text = self._extract_text_from_response(response_data)
# 如果有文本内容,使用流式输出优化器处理
if text:
# 如果有文本内容,且开启了流式输出优化器,则使用流式输出优化器处理
if text and settings.STREAM_OPTIMIZER_ENABLED:
# 使用流式输出优化器处理文本输出
async for (
optimized_chunk

View File

@@ -115,7 +115,6 @@ def _build_payload(
"contents": messages,
"generationConfig": {
"temperature": request.temperature,
"maxOutputTokens": request.max_tokens,
"stopSequences": request.stop,
"topP": request.top_p,
"topK": request.top_k,
@@ -123,6 +122,8 @@ def _build_payload(
"tools": _build_tools(request, messages),
"safetySettings": _get_safety_settings(request.model),
}
if request.max_tokens is not None:
payload["generationConfig"]["maxOutputTokens"] = request.max_tokens
if request.model.endswith("-image") or request.model.endswith("-image-generation"):
payload["generationConfig"]["responseModalities"] = ["Text", "Image"]
@@ -214,7 +215,7 @@ class OpenAIChatService:
if openai_chunk:
# 提取文本内容
text = self._extract_text_from_openai_chunk(openai_chunk)
if text:
if text and settings.STREAM_OPTIMIZER_ENABLED:
# 使用流式输出优化器处理文本输出
async for (
optimized_chunk

View File

@@ -81,6 +81,13 @@ class KeyManager:
return {"valid_keys": valid_keys, "invalid_keys": invalid_keys}
async def get_first_valid_key(self) -> str:
"""获取第一个有效的API key"""
async with self.failure_count_lock:
for key in self.key_failure_counts:
if self.key_failure_counts[key] < self.MAX_FAILURES:
return key
return self.api_keys[0]
_singleton_instance = None
_singleton_lock = asyncio.Lock()