The upstream _parse_chat_history enforcement code uses a first_fc_seen
flag that only adds DUMMY_THOUGHT_SIGNATURE to the first function_call
without thought_signature. Parallel function calls (position 2+) remain
unpatched, causing Gemini API 400 errors for all Gemini 2.5+ models.
Additionally, _is_gemini_3_or_later only matches 'gemini-3', missing
Gemini 2.5 models entirely.
This patch:
1. Extends _is_gemini_3_or_later to also match gemini-2.5 models
2. Wraps _parse_chat_history to ensure ALL function_call parts in ALL
model messages have thought_signature (not just the first one)
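
A minimal sketch of the two changes above, using plain dicts rather than the library's real message types. The function names `needs_thought_signature` and `ensure_thought_signatures`, the dict layout, and the placeholder value of `DUMMY_THOUGHT_SIGNATURE` are assumptions for illustration, not the upstream implementation:

```python
from typing import Any

# Placeholder value only; the real constant is defined by the upstream library.
DUMMY_THOUGHT_SIGNATURE = b"dummy-thought-signature"


def needs_thought_signature(model_name: str) -> bool:
    """Hypothetical stand-in for the extended _is_gemini_3_or_later check."""
    name = model_name.lower()
    return any(tag in name for tag in ("gemini-2.5", "gemini-3"))


def ensure_thought_signatures(messages: list[dict[str, Any]]) -> int:
    """Add a signature to every unsigned function_call part in model messages.

    Unlike a first-seen flag, this visits every part, so parallel calls at
    position 2+ are patched too. Returns the number of parts patched.
    """
    patched = 0
    for message in messages:
        if message.get("role") != "model":
            continue
        for part in message.get("parts", []):
            if "function_call" in part and not part.get("thought_signature"):
                part["thought_signature"] = DUMMY_THOUGHT_SIGNATURE
                patched += 1
    return patched
```

Parts that already carry a signature are left untouched, so a real signature returned by the API is never overwritten.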
Read the latest LLM connection settings when building runtime clients so Web updates take effect immediately instead of reusing module-import defaults.
Closes #5757
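
The difference between import-time defaults and reading settings at build time can be sketched as follows. `LLMSettings`, `settings`, and `build_client_config` are hypothetical names for illustration; the project's actual settings object and client constructor may look different:

```python
from dataclasses import dataclass


@dataclass
class LLMSettings:
    """Hypothetical settings holder, mutated when the user saves new values."""
    base_url: str = "https://example.invalid/v1"
    model: str = "model-a"


settings = LLMSettings()

# Anti-pattern: this value is frozen when the module is first imported.
MODEL_AT_IMPORT = settings.model


def build_client_config() -> dict[str, str]:
    """Read the settings at call time so later updates are picked up."""
    return {"base_url": settings.base_url, "model": settings.model}
```

After `settings.model` is changed, `build_client_config()` reflects the new value, while `MODEL_AT_IMPORT` still holds the old one.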
- Add compatibility patch for the langchain-openai Responses API so that system messages are extracted as top-level instructions, as Codex endpoints require.
- Update provider list: add Alibaba, Volcengine, and Tencent TokenHub; adjust SiliconFlow and MiniMax endpoints; refine provider ordering and model list strategies.
- Extend models.dev-only listing logic for providers lacking stable models.list endpoints.
- Increase models.dev cache TTL for improved efficiency.
- Add tests for the OpenAI Responses API and streaming compatibility patches.
- Move LLMHelper and related logic from app.helper.llm to app.agent.llm.helper
- Update all imports to reference new LLMHelper location
- Introduce app/agent/llm/__init__.py for internal LLM adapter exports
- Add llm.py API router with endpoints for model listing, provider auth, and test calls
- Remove legacy LLM endpoints from system.py
- Update requirements for langchain-anthropic and anthropic
- Refactor test_llm_helper_testcall.py for async LLMHelper usage and new import paths
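
For the Responses API compatibility item earlier in this list, the idea of lifting system messages into top-level instructions can be sketched with plain dicts. The helper name `split_instructions` and the message layout are assumptions for illustration and do not reproduce the actual patch; only the `instructions` and `input` request fields come from the OpenAI Responses API:

```python
from typing import Any, Optional


def split_instructions(
    messages: list[dict[str, Any]],
) -> tuple[Optional[str], list[dict[str, Any]]]:
    """Separate system messages from the rest of the conversation.

    Returns the joined system text (or None if there is none) and the
    remaining messages in their original order.
    """
    system_texts = [m["content"] for m in messages if m.get("role") == "system"]
    remaining = [m for m in messages if m.get("role") != "system"]
    instructions = "\n\n".join(system_texts) if system_texts else None
    return instructions, remaining


def build_payload(model: str, messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Build a request body with system text as top-level instructions."""
    instructions, remaining = split_instructions(messages)
    payload: dict[str, Any] = {"model": model, "input": remaining}
    if instructions is not None:
        payload["instructions"] = instructions
    return payload
```

When there is no system message, the `instructions` key is omitted entirely rather than sent as an empty string.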