mirror of https://github.com/snailyp/gemini-balance.git
synced 2026-05-07 05:02:48 +08:00
feat: Add TTS-related configuration and features

- Add TTS model, voice name, and speed options to .env.example
- Update the README with documentation for the TTS settings
- Add TTS-related settings to the configuration class
- Add a TTS request model to support text-to-speech
- Update the smart routing middleware to support audio requests
- Add an API endpoint for handling TTS requests to the routes
- Update the frontend configuration editor to support TTS options
@@ -74,3 +74,8 @@ FAKE_STREAM_EMPTY_DATA_INTERVAL_SECONDS=5
# Safety settings (JSON string format)
# Note: the example values here may need adjustment depending on actual model support
SAFETY_SETTINGS=[{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "OFF"}, {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "OFF"}, {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "OFF"}, {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "OFF"}, {"category": "HARM_CATEGORY_CIVIC_INTEGRITY", "threshold": "BLOCK_NONE"}]
URL_NORMALIZATION_ENABLED=false

# TTS settings
TTS_MODEL=gemini-2.5-flash-preview-tts
TTS_VOICE_NAME=Zephyr
TTS_SPEED=normal
README.md
@@ -11,7 +11,7 @@
[](https://www.uvicorn.org/)
[](https://t.me/+soaHax5lyI0wZDVl)

> Telegram Group: https://t.me/+soaHax5lyI0wZDVl
> Telegram Group: <https://t.me/+soaHax5lyI0wZDVl>

## Project Introduction
@@ -40,39 +40,39 @@ app/

## ✨ Feature Highlights

* **Multi-Key Load Balancing**: Supports configuring multiple Gemini API Keys (`API_KEYS`) for automatic sequential polling, improving availability and concurrency.
* **Visual Configuration Takes Effect Immediately**: Configuration changes made through the admin backend take effect without restarting the service. Remember to click save for changes to apply.

* **Dual Protocol API Compatibility**: Supports forwarding chat API requests in both Gemini and OpenAI formats.

```plaintext
openai baseurl `http://localhost:8000(/hf)/v1`
gemini baseurl `http://localhost:8000(/gemini)/v1beta`
```
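The path segments in parentheses above are optional, so each protocol is reachable both with and without its prefix. A small illustrative sketch (the `base_url` helper is hypothetical, not part of the repo):

```python
def base_url(host: str, prefix: str = "", api: str = "openai") -> str:
    """Compose a gemini-balance base URL; `prefix` is the optional "/hf" or "/gemini" segment."""
    suffix = "/v1" if api == "openai" else "/v1beta"
    return f"{host}{prefix}{suffix}"

urls = [
    base_url("http://localhost:8000"),                       # OpenAI style, no prefix
    base_url("http://localhost:8000", "/hf"),                # OpenAI style, /hf prefix
    base_url("http://localhost:8000", api="gemini"),         # Gemini style, no prefix
    base_url("http://localhost:8000", "/gemini", "gemini"),  # Gemini style, /gemini prefix
]
```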

* **Supports Image-Text Chat and Image Modification**: `IMAGE_MODELS` configures which models can perform image-text chat and image editing. When actually calling, use the `configured_model-image` model name to use this feature.

* **Supports Web Search**: `SEARCH_MODELS` configures which models can perform web searches. When actually calling, use the `configured_model-search` model name to use this feature.

* **Key Status Monitoring**: Provides a `/keys_status` page (requires authentication) to view the status and usage of each Key in real time.

* **Detailed Logging**: Provides detailed error logs for easy troubleshooting.

* **Support for Custom Gemini Proxy**: Supports custom Gemini proxies, such as those built on Deno or Cloudflare.
* **OpenAI Image Generation API Compatibility**: Adapts the `imagen-3.0-generate-002` model interface to be compatible with the OpenAI image generation API, supporting client calls.
* **Flexible Key Addition**: Add keys flexibly via regex matching on `gemini_key`, with key deduplication.

* **OpenAI Format Embeddings API Compatibility**: Fully adapts the OpenAI-format `embeddings` interface, usable for local document vectorization.
* **Streamlined Response Optimization**: Optional stream output optimizer (`STREAM_OPTIMIZER_ENABLED`) to improve the experience of long-text stream responses.
* **Failure Retry and Key Management**: Automatically handles API request failures with retries (`MAX_RETRIES`), disables Keys after too many failures (`MAX_FAILURES`), and periodically checks for recovery (`CHECK_INTERVAL_HOURS`).
* **Docker Support**: Supports Docker deployment on AMD and ARM architectures. You can also build your own Docker image.

  > Image address: docker pull ghcr.io/snailyp/gemini-balance:latest

* **Automatic Model List Maintenance**: Supports fetching OpenAI and Gemini model lists, fully compatible with NewAPI's automatic model list fetching; no manual entry required.
* **Support for Removing Unused Models**: Many of the default models go unused; filter them out with `FILTERED_MODELS`.
* **Proxy Support**: Supports configuring HTTP/SOCKS5 proxy servers (`PROXIES`) for accessing the Gemini API, convenient in restricted network environments. Supports adding proxies in batch.

## 🚀 Quick Start
@@ -80,79 +80,83 @@ app/

#### a) Build with Dockerfile

1. **Build Image**:

   ```bash
   docker build -t gemini-balance .
   ```

2. **Run Container**:

   ```bash
   docker run -d -p 8000:8000 --env-file .env gemini-balance
   ```

   * `-d`: Run in detached mode.
   * `-p 8000:8000`: Map port 8000 of the container to port 8000 of the host.
   * `--env-file .env`: Use the `.env` file to set environment variables.

   > Note: If using an SQLite database, you need to mount a data volume to persist it:
   >
   > ```bash
   > docker run -d -p 8000:8000 --env-file .env -v /path/to/data:/app/data gemini-balance
   > ```
   >
   > Where `/path/to/data` is the data storage path on the host, and `/app/data` is the data directory inside the container.

#### b) Deploy with an Existing Docker Image

1. **Pull Image**:

   ```bash
   docker pull ghcr.io/snailyp/gemini-balance:latest
   ```

2. **Run Container**:

   ```bash
   docker run -d -p 8000:8000 --env-file .env ghcr.io/snailyp/gemini-balance:latest
   ```

   * `-d`: Run in detached mode.
   * `-p 8000:8000`: Map port 8000 of the container to port 8000 of the host (adjust as needed).
   * `--env-file .env`: Use the `.env` file to set environment variables (ensure the `.env` file exists in the directory where the command is executed).

   > Note: If using an SQLite database, you need to mount a data volume to persist it:
   >
   > ```bash
   > docker run -d -p 8000:8000 --env-file .env -v /path/to/data:/app/data ghcr.io/snailyp/gemini-balance:latest
   > ```
   >
   > Where `/path/to/data` is the data storage path on the host, and `/app/data` is the data directory inside the container.

### Run Locally (Suitable for Development and Testing)

If you want to run the source code directly for development or testing, follow these steps:

1. **Ensure Prerequisites are Met**:
   * Clone the repository locally.
   * Install Python 3.9 or higher.
   * Create and configure the `.env` file in the project root directory (refer to the "Configure Environment Variables" section above).
   * Install project dependencies:

     ```bash
     pip install -r requirements.txt
     ```

2. **Start Application**:
   Run the following command in the project root directory:

   ```bash
   uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
   ```

   * `app.main:app`: Specifies the location of the FastAPI application instance (the `app` object in the `main.py` file within the `app` module).
   * `--host 0.0.0.0`: Makes the application accessible from any IP address on the local network.
   * `--port 8000`: Specifies the port the application listens on (change as needed).
   * `--reload`: Enables automatic reloading; the service restarts when you modify the code, which is convenient for development (remove this option in production).

3. **Access Application**:
   After the application starts, access `http://localhost:8000` (or the host and port you specified) through a browser or API tool.

### Complete Configuration List
@@ -181,6 +185,7 @@ If you want to run the source code directly locally for development or testing,
| `SHOW_THINKING_PROCESS` | Optional, whether to display the model's thinking process | `true` |
| `THINKING_MODELS` | Optional, list of models that support thinking functions | `[]` |
| `THINKING_BUDGET_MAP` | Optional, thinking function budget mapping (model_name:budget_value) | `{}` |
| `URL_NORMALIZATION_ENABLED` | Optional, whether to enable intelligent URL routing mapping | `false` |
| `BASE_URL` | Optional, Gemini API base URL, no modification needed by default | `https://generativelanguage.googleapis.com/v1beta` |
| `MAX_FAILURES` | Optional, number of times a single key is allowed to fail | `3` |
| `MAX_RETRIES` | Optional, maximum number of retries for failed API requests | `3` |
@@ -194,6 +199,10 @@ If you want to run the source code directly locally for development or testing,
| `AUTO_DELETE_REQUEST_LOGS_ENABLED`| Optional, whether to enable automatic deletion of request logs | `false` |
| `AUTO_DELETE_REQUEST_LOGS_DAYS` | Optional, automatically delete request logs older than this many days (e.g., 1, 7, 30) | `30` |
| `SAFETY_SETTINGS` | Optional, safety settings (JSON string format), used to configure content safety thresholds. Example values may need adjustment based on actual model support. | `[{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "OFF"}, {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "OFF"}, {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "OFF"}, {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "OFF"}, {"category": "HARM_CATEGORY_CIVIC_INTEGRITY", "threshold": "BLOCK_NONE"}]` |
| **TTS Related** | | |
| `TTS_MODEL` | Optional, TTS model name | `gemini-2.5-flash-preview-tts` |
| `TTS_VOICE_NAME` | Optional, TTS voice name | `Zephyr` |
| `TTS_SPEED` | Optional, TTS speed | `normal` |
| **Image Generation Related** | | |
| `PAID_KEY` | Optional, paid API Key for advanced features like image generation | `your-paid-api-key` |
| `CREATE_IMAGE_MODEL` | Optional, image generation model | `imagen-3.0-generate-002` |
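Since `SAFETY_SETTINGS` is stored as a JSON string, a quick sanity check that the example value above parses and carries one threshold per harm category can be sketched as:

```python
import json

# The example SAFETY_SETTINGS value from the table above, as a raw JSON string.
raw = (
    '[{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "OFF"}, '
    '{"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "OFF"}, '
    '{"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "OFF"}, '
    '{"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "OFF"}, '
    '{"category": "HARM_CATEGORY_CIVIC_INTEGRITY", "threshold": "BLOCK_NONE"}]'
)
settings = json.loads(raw)
categories = {entry["category"] for entry in settings}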
@@ -219,20 +228,20 @@ The following are the main API endpoints provided by the service:

### Gemini API Related (`(/gemini)/v1beta`)

* `GET /models`: List available Gemini models.
* `POST /models/{model_name}:generateContent`: Generate content using the specified Gemini model.
* `POST /models/{model_name}:streamGenerateContent`: Stream content generation using the specified Gemini model.

### OpenAI API Related

* `GET (/hf)/v1/models`: List available models (uses Gemini format underneath).
* `POST (/hf)/v1/chat/completions`: Perform chat completion (uses Gemini format underneath, supports streaming).
* `POST (/hf)/v1/embeddings`: Create text embeddings (uses Gemini format underneath).
* `POST (/hf)/v1/images/generations`: Generate images (uses Gemini format underneath).
* `GET /openai/v1/models`: List available models (uses OpenAI format underneath).
* `POST /openai/v1/chat/completions`: Perform chat completion (uses OpenAI format underneath, supports streaming, can prevent truncation, and is faster).
* `POST /openai/v1/embeddings`: Create text embeddings (uses OpenAI format underneath).
* `POST /openai/v1/images/generations`: Generate images (uses OpenAI format underneath).

## 🤝 Contributing
@@ -242,9 +251,9 @@ Pull Requests or Issues are welcome.

Special thanks to the following projects and platforms for providing image hosting services for this project:

* [PicGo](https://www.picgo.net/)
* [SM.MS](https://smms.app/)
* [CloudFlare-ImgBed](https://github.com/MarSeventh/CloudFlare-ImgBed) open source project

## 🙏 Thanks to Contributors
@@ -266,7 +275,7 @@ CDN acceleration and security protection for this project are sponsored by Tence

## 💖 Friendly Projects

* **[OneLine](https://github.com/chengtx809/OneLine)** by [chengtx809](https://github.com/chengtx809) - OneLine: AI-driven hot event timeline generation tool

## 🎁 Project Support
@@ -178,6 +178,7 @@ app/
| `SHOW_THINKING_PROCESS` | Optional, whether to display the model's thinking process | `true` |
| `THINKING_MODELS` | Optional, list of models that support thinking functions | `[]` |
| `THINKING_BUDGET_MAP` | Optional, thinking function budget mapping (model_name:budget_value) | `{}` |
| `URL_NORMALIZATION_ENABLED` | Optional, whether to enable intelligent URL routing mapping | `false` |
| `BASE_URL` | Optional, Gemini API base URL, no modification needed by default | `https://generativelanguage.googleapis.com/v1beta` |
| `MAX_FAILURES` | Optional, number of times a single key is allowed to fail | `3` |
| `MAX_RETRIES` | Optional, maximum number of retries for failed API requests | `3` |
@@ -191,6 +192,10 @@ app/
| `AUTO_DELETE_REQUEST_LOGS_ENABLED`| Optional, whether to enable automatic deletion of request logs | `false` |
| `AUTO_DELETE_REQUEST_LOGS_DAYS` | Optional, automatically delete request logs older than this many days (e.g., 1, 7, 30) | `30` |
| `SAFETY_SETTINGS` | Optional, safety settings (JSON string format), used to configure content safety thresholds. Example values may need adjustment based on actual model support. | `[{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "OFF"}, {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "OFF"}, {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "OFF"}, {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "OFF"}, {"category": "HARM_CATEGORY_CIVIC_INTEGRITY", "threshold": "BLOCK_NONE"}]` |
| **TTS Related** | | |
| `TTS_MODEL` | Optional, TTS model name | `gemini-2.5-flash-preview-tts` |
| `TTS_VOICE_NAME` | Optional, TTS voice name | `Zephyr` |
| `TTS_SPEED` | Optional, TTS speech rate | `normal` |
| **Image Generation Related** | | |
| `PAID_KEY` | Optional, paid API Key for advanced features such as image generation | `your-paid-api-key` |
| `CREATE_IMAGE_MODEL` | Optional, image generation model | `imagen-3.0-generate-002` |
@@ -77,6 +77,11 @@ class Settings(BaseSettings):
    THINKING_MODELS: List[str] = []
    THINKING_BUDGET_MAP: Dict[str, float] = {}

    # TTS settings
    TTS_MODEL: str = "gemini-2.5-flash-preview-tts"
    TTS_VOICE_NAME: str = "Zephyr"
    TTS_SPEED: str = "normal"

    # Image generation settings
    PAID_KEY: str = ""
    CREATE_IMAGE_MODEL: str = DEFAULT_CREATE_IMAGE_MODEL
@@ -33,3 +33,10 @@ class ImageGenerationRequest(BaseModel):
    quality: Optional[str] = None
    style: Optional[str] = None
    response_format: Optional[str] = "url"


class TTSRequest(BaseModel):
    model: str = "gemini-2.5-flash-preview-tts"
    input: str
    voice: str = "Kore"
    response_format: Optional[str] = "wav"
@@ -67,9 +67,9 @@ class SmartRoutingMiddleware(BaseHTTPMiddleware):
            r"^/gemini/v1beta/models/[^/:]+:(generate|streamGenerate)Content$",  # Gemini with prefix
            r"^/v1beta/models$",  # Gemini model list
            r"^/gemini/v1beta/models$",  # Gemini model list with prefix
            r"^/v1/(chat/completions|models|embeddings|images/generations)$",  # v1 format
            r"^/openai/v1/(chat/completions|models|embeddings|images/generations)$",  # OpenAI format
            r"^/hf/v1/(chat/completions|models|embeddings|images/generations)$",  # HF format
            r"^/v1/(chat/completions|models|embeddings|images/generations|audio/speech)$",  # v1 format
            r"^/openai/v1/(chat/completions|models|embeddings|images/generations|audio/speech)$",  # OpenAI format
            r"^/hf/v1/(chat/completions|models|embeddings|images/generations|audio/speech)$",  # HF format
            r"^/vertex-express/v1beta/models/[^/:]+:(generate|streamGenerate)Content$",  # Vertex Express Gemini format
            r"^/vertex-express/v1beta/models$",  # Vertex Express model list
            r"^/vertex-express/v1/(chat/completions|models|embeddings|images/generations)$",  # Vertex Express OpenAI format
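The effect of extending these patterns with `audio/speech` can be checked directly. A small sketch using one of the updated regexes from the hunk above:

```python
import re

# The updated v1 route pattern from the middleware, now including audio/speech.
V1_PATTERN = re.compile(
    r"^/v1/(chat/completions|models|embeddings|images/generations|audio/speech)$"
)

candidates = ["/v1/audio/speech", "/v1/chat/completions", "/v1/audio/transcriptions"]
matched = [p for p in candidates if V1_PATTERN.match(p)]
```

Only the paths listed in the alternation match; `/v1/audio/transcriptions` still falls through.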
@@ -146,6 +146,8 @@ class SmartRoutingMiddleware(BaseHTTPMiddleware):
                return "/openai/v1/embeddings", {"type": "openai_embeddings"}
            elif "image" in path.lower():
                return "/openai/v1/images/generations", {"type": "openai_images"}
            elif "audio" in path.lower():
                return "/openai/v1/audio/speech", {"type": "openai_audio"}
        elif method == "GET":
            if "model" in path.lower():
                return "/openai/v1/models", {"type": "openai_models"}
@@ -161,6 +163,8 @@ class SmartRoutingMiddleware(BaseHTTPMiddleware):
                return "/v1/embeddings", {"type": "v1_embeddings"}
            elif "image" in path.lower():
                return "/v1/images/generations", {"type": "v1_images"}
            elif "audio" in path.lower():
                return "/v1/audio/speech", {"type": "v1_audio"}
        elif method == "GET":
            if "model" in path.lower():
                return "/v1/models", {"type": "v1_models"}
@@ -1,4 +1,4 @@
from fastapi import APIRouter, Depends, HTTPException
from fastapi import APIRouter, Depends, HTTPException, Response
from fastapi.responses import StreamingResponse

from app.config.config import settings
@@ -7,6 +7,7 @@ from app.domain.openai_models import (
    ChatRequest,
    EmbeddingRequest,
    ImageGenerationRequest,
    TTSRequest,
)
from app.handler.retry_handler import RetryHandler
from app.handler.error_handler import handle_route_errors
@@ -14,6 +15,7 @@ from app.log.logger import get_openai_logger
from app.service.chat.openai_chat_service import OpenAIChatService
from app.service.embedding.embedding_service import EmbeddingService
from app.service.image.image_create_service import ImageCreateService
from app.service.tts.tts_service import TTSService
from app.service.key.key_manager import KeyManager, get_key_manager_instance
from app.service.model.model_service import ModelService
@@ -24,6 +26,7 @@ security_service = SecurityService()
model_service = ModelService()
embedding_service = EmbeddingService()
image_create_service = ImageCreateService()
tts_service = TTSService()


async def get_key_manager():
@@ -41,6 +44,11 @@ async def get_openai_chat_service(key_manager: KeyManager = Depends(get_key_mana
    return OpenAIChatService(settings.BASE_URL, key_manager)


async def get_tts_service():
    """Return the TTS service instance."""
    return tts_service


@router.get("/v1/models")
@router.get("/hf/v1/models")
async def list_models(
@@ -147,3 +155,21 @@ async def get_keys_list(
        },
        "total": len(keys_status["valid_keys"]) + len(keys_status["invalid_keys"]),
    }


@router.post("/v1/audio/speech")
@router.post("/hf/v1/audio/speech")
async def text_to_speech(
    request: TTSRequest,
    _=Depends(security_service.verify_authorization),
    api_key: str = Depends(get_next_working_key_wrapper),
    tts_service: TTSService = Depends(get_tts_service),
):
    """Handle an OpenAI-style TTS request."""
    operation_name = "text_to_speech"
    async with handle_route_errors(logger, operation_name):
        logger.info(f"Handling TTS request for model: {request.model}")
        logger.debug(f"Request: \n{request.model_dump_json(indent=2)}")
        logger.info(f"Using API key: {api_key}")
        audio_data = await tts_service.create_tts(request, api_key)
        return Response(content=audio_data, media_type="audio/wav")
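A client-side sketch of calling the new endpoint. The host, port, and token are assumptions for a local deployment; the response body is WAV audio per the `media_type="audio/wav"` above:

```python
import json
import urllib.request

url = "http://localhost:8000/v1/audio/speech"  # assumed local deployment
body = json.dumps({
    "model": "gemini-2.5-flash-preview-tts",
    "input": "Hello from gemini-balance",
    "voice": "Kore",
}).encode()

req = urllib.request.Request(
    url,
    data=body,
    headers={
        "Authorization": "Bearer your-token",  # hypothetical auth token
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req).read() would return the WAV bytes;
# not executed here since it requires a running instance.
```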
app/service/tts/tts_service.py (new file)
@@ -0,0 +1,94 @@
import datetime
import io
import re
import time
import wave
from typing import Optional

from google import genai

from app.config.config import settings
from app.database.services import add_error_log, add_request_log
from app.domain.openai_models import TTSRequest
from app.log.logger import get_openai_logger

logger = get_openai_logger()


def _create_wav_file(audio_data: bytes) -> bytes:
    """Creates a WAV file in memory from raw audio data."""
    with io.BytesIO() as wav_file:
        with wave.open(wav_file, "wb") as wf:
            wf.setnchannels(1)  # Mono
            wf.setsampwidth(2)  # 16-bit
            wf.setframerate(24000)  # 24kHz sample rate
            wf.writeframes(audio_data)
        return wav_file.getvalue()


class TTSService:
    async def create_tts(self, request: TTSRequest, api_key: str) -> Optional[bytes]:
        """Create audio using the Google Gemini SDK."""
        start_time = time.perf_counter()
        request_datetime = datetime.datetime.now()
        is_success = False
        status_code = None
        response = None
        error_log_msg = ""
        try:
            client = genai.Client(api_key=api_key)
            response = await client.aio.models.generate_content(
                model=settings.TTS_MODEL,
                contents=f"Speak in a {settings.TTS_SPEED} speed voice: {request.input}",
                config={
                    "response_modalities": ["Audio"],
                    "speech_config": {
                        "voice_config": {
                            "prebuilt_voice_config": {
                                "voice_name": settings.TTS_VOICE_NAME
                            }
                        }
                    },
                },
            )
            if (
                response.candidates
                and response.candidates[0].content.parts
                and response.candidates[0].content.parts[0].inline_data
            ):
                raw_audio_data = response.candidates[0].content.parts[0].inline_data.data
                is_success = True
                status_code = 200
                return _create_wav_file(raw_audio_data)
        except Exception as e:
            is_success = False
            error_log_msg = f"Generic error: {e}"
            logger.error(f"An error occurred in TTSService: {error_log_msg}")
            match = re.search(r"status code (\d+)", str(e))
            if match:
                status_code = int(match.group(1))
            else:
                status_code = 500
            raise
        finally:
            end_time = time.perf_counter()
            latency_ms = int((end_time - start_time) * 1000)
            if not is_success:
                await add_error_log(
                    gemini_key=api_key,
                    model_name=settings.TTS_MODEL,
                    error_type="google-tts",
                    error_log=error_log_msg,
                    error_code=status_code,
                    request_msg=request.input,
                )
            await add_request_log(
                model_name=settings.TTS_MODEL,
                api_key=api_key,
                is_success=is_success,
                status_code=status_code,
                latency_ms=latency_ms,
                request_time=request_datetime,
            )
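The WAV wrapping in `_create_wav_file` can be sanity-checked in isolation: write raw 16-bit mono 24 kHz PCM, then read the header back. This is a standalone re-creation of the helper above:

```python
import io
import wave

def create_wav_file(audio_data: bytes) -> bytes:
    """Wrap raw 16-bit mono 24 kHz PCM in a WAV container (mirrors _create_wav_file)."""
    with io.BytesIO() as wav_file:
        with wave.open(wav_file, "wb") as wf:
            wf.setnchannels(1)      # mono
            wf.setsampwidth(2)      # 16-bit samples
            wf.setframerate(24000)  # 24 kHz sample rate
            wf.writeframes(audio_data)
        return wav_file.getvalue()

wav_bytes = create_wav_file(b"\x00\x00" * 24000)  # one second of silence
with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
    params = wf.getparams()
```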
@@ -745,6 +745,13 @@ endblock %} {% block head_extra_styles %}
        >
          Model Settings
        </button>
        <button
          class="tab-btn px-5 py-2 rounded-full font-medium text-sm transition-all duration-200"
          data-tab="tts"
          style="background-color: #f8fafc !important; color: #64748b !important; border: 2px solid #e2e8f0 !important; box-shadow: 0 1px 3px 0 rgba(0, 0, 0, 0.1) !important; font-weight: 500 !important;"
        >
          TTS Settings
        </button>
        <button
          class="tab-btn px-5 py-2 rounded-full font-medium text-sm transition-all duration-200"
          data-tab="image"
@@ -1370,11 +1377,97 @@ endblock %} {% block head_extra_styles %}
          </div>
        </div>
      </div>

      <!-- Image generation settings -->
      <div class="config-section" id="image-section">

      <!-- TTS settings -->
      <div class="config-section" id="tts-section">
        <h2
          class="text-xl font-bold mb-6 pb-3 border-b flex items-center gap-2 text-gray-800 border-violet-300 border-opacity-30"
        >
          <i class="fas fa-volume-up text-violet-400"></i> TTS Settings
        </h2>

        <!-- TTS model -->
        <div class="mb-6">
          <label for="TTS_MODEL" class="block font-semibold mb-2 text-gray-700"
            >TTS Model</label
          >
          <select
            id="TTS_MODEL"
            name="TTS_MODEL"
            class="w-full px-4 py-3 rounded-lg form-select-themed"
          >
            <option value="gemini-2.5-flash-preview-tts">gemini-2.5-flash-preview-tts</option>
            <option value="gemini-2.5-pro-preview-tts">gemini-2.5-pro-preview-tts</option>
          </select>
          <small class="text-gray-500 mt-1 block">Model used for TTS</small>
        </div>

        <!-- TTS voice name -->
        <div class="mb-6">
          <label for="TTS_VOICE_NAME" class="block font-semibold mb-2 text-gray-700"
            >TTS Voice Name</label
          >
          <select
            id="TTS_VOICE_NAME"
            name="TTS_VOICE_NAME"
            class="w-full px-4 py-3 rounded-lg form-select-themed"
          >
            <option value="Zephyr">Zephyr (Bright)</option>
            <option value="Puck">Puck (Upbeat)</option>
            <option value="Charon">Charon (Informative)</option>
            <option value="Kore">Kore (Firm)</option>
            <option value="Fenrir">Fenrir (Excitable)</option>
            <option value="Leda">Leda (Youthful)</option>
            <option value="Orus">Orus (Firm)</option>
            <option value="Aoede">Aoede (Breezy)</option>
            <option value="Callirhoe">Callirhoe (Easy-going)</option>
            <option value="Autonoe">Autonoe (Bright)</option>
            <option value="Enceladus">Enceladus (Breathy)</option>
            <option value="Iapetus">Iapetus (Clear)</option>
            <option value="Umbriel">Umbriel (Easy-going)</option>
            <option value="Algieba">Algieba (Smooth)</option>
            <option value="Despina">Despina (Smooth)</option>
            <option value="Erinome">Erinome (Clear)</option>
            <option value="Algenib">Algenib (Gravelly)</option>
            <option value="Rasalgethi">Rasalgethi (Informative)</option>
            <option value="Laomedeia">Laomedeia (Upbeat)</option>
            <option value="Achernar">Achernar (Soft)</option>
            <option value="Alnilam">Alnilam (Firm)</option>
            <option value="Schedar">Schedar (Even)</option>
            <option value="Gacrux">Gacrux (Mature)</option>
            <option value="Pulcherrima">Pulcherrima (Forward)</option>
            <option value="Achird">Achird (Friendly)</option>
            <option value="Zubenelgenubi">Zubenelgenubi (Casual)</option>
            <option value="Vindemiatrix">Vindemiatrix (Gentle)</option>
            <option value="Sadachbia">Sadachbia (Lively)</option>
            <option value="Sadaltager">Sadaltager (Knowledgeable)</option>
            <option value="Sulafat">Sulafat (Warm)</option>
          </select>
          <small class="text-gray-500 mt-1 block">TTS voice name, controlling style, tone, accent, and pacing</small>
        </div>

        <!-- TTS speed -->
        <div class="mb-6">
          <label for="TTS_SPEED" class="block font-semibold mb-2 text-gray-700"
            >TTS Speed</label
          >
          <select
            id="TTS_SPEED"
            name="TTS_SPEED"
            class="w-full px-4 py-3 rounded-lg form-select-themed"
          >
            <option value="slow">Slow</option>
            <option value="normal">Normal</option>
            <option value="fast">Fast</option>
          </select>
          <small class="text-gray-500 mt-1 block">Choose the TTS speech rate</small>
        </div>
      </div>

      <!-- Image generation settings -->
      <div class="config-section" id="image-section">
        <h2
          class="text-xl font-bold mb-6 pb-3 border-b flex items-center gap-2 text-gray-800 border-violet-300 border-opacity-30"
        >
          <i class="fas fa-image text-violet-400"></i> Image Generation Settings
        </h2>
@@ -1511,12 +1604,12 @@ endblock %} {% block head_extra_styles %}
            </div>
          </div>

          <!-- Stream output optimizer settings -->
          <!-- Stream output settings -->
          <div class="config-section" id="stream-section">
            <h2
              class="text-xl font-bold mb-6 pb-3 border-b flex items-center gap-2 text-gray-800 border-violet-300 border-opacity-30"
            >
              <i class="fas fa-stream text-violet-400"></i> Stream Output Optimizer
              <i class="fas fa-stream text-violet-400"></i> Stream Output Settings
            </h2>

            <!-- Enable stream output optimization -->