246 Commits

Author SHA1 Message Date
lilong.129
32d2f7d27d fix: update input interaction endpoint in WDADriver to use new URL 2025-06-26 19:06:59 +08:00
lilong.129
1923996f65 fix: clean up Appium settings initialization 2025-06-26 17:30:00 +08:00
lilong.129
98dc80cca5 merge master 2025-06-26 14:57:56 +08:00
lilong.129
1b48763e57 refactor: clean up function signatures in DriverSession and WDADriver for improved readability 2025-06-26 14:43:39 +08:00
lilong.129
3e4858db44 refactor: update URL handling in DriverSession and WDADriver, removing base URL dependency for HTTP requests 2025-06-26 14:37:41 +08:00
lilong.129
88d255bce1 refactor: remove GET/POST methods with base URL, streamline HTTP requests in UIA2 and WDA drivers 2025-06-26 14:18:11 +08:00
lilong.129
51e106c9ad fix: improve JSON parsing in ForegroundInfo by cleaning packageInfo 2025-06-26 13:48:42 +08:00
lilong.129
90ce090e35 fix: remove redundant message cleaning logic in callModelWithLogging
The previous message cleaning logic was flawed:
- cleanedMsg.Content was already set to message.Content
- The condition checked if message.Content == "" then set cleanedMsg.Content = ""
- This was redundant since cleanedMsg.Content would already be empty

The real fix for the API 400 error is in planner.go where we ensure Tool messages
have non-empty content. The utils.go changes were unnecessary.
2025-06-26 13:41:39 +08:00
余泓铮
2cffdf01b1 fix: resolve syntax error in unit tests 2025-06-26 11:07:01 +08:00
余泓铮
552aa516b9 fix: resolve ui2 incompatibility issue 2025-06-26 11:04:09 +08:00
lilong.129
e070801b00 fix: resolve doubao-seed-1.6-250615 model API 400 error with empty content
- Fix Tool message content issue when model returns empty content in function calling
- Add content validation in callModelWithLogging to handle empty content in messages
- Ensure compatibility between UI-TARS and function calling models

This resolves the "missing messages.content parameter" error when using 
doubao-seed-1.6-250615 model compared to doubao-1.5-ui-tars-250328
2025-06-26 10:57:27 +08:00
余泓铮
a31a8c67f0 feat: optimize code structure 2025-06-26 00:46:07 +08:00
余泓铮
22d4f12114 feat: refactor session and wda, removing the session concept; session_driver becomes a pure HTTP call utility 2025-06-26 00:39:09 +08:00
lilong.129
9b9875dc2c feat: enhance HTML report with modals for summary JSON and log content, and improve download functionality 2025-06-25 15:30:32 +08:00
lilong.129
0491299245 merge master 2025-06-25 12:07:38 +08:00
lilong.129
70471d2fb4 fix: enhance logging for interrupted processes and ensure step results are saved in failfast mode 2025-06-25 11:57:09 +08:00
徐聪
470bde97d7 fix: failed to return err 2025-06-25 11:13:56 +08:00
徐聪
4f605d5558 fix: add actions to web driver 2025-06-24 23:17:05 +08:00
lilong.129
53fad4edc5 refactor: streamline AI assertion result handling by consolidating error management and improving result structure 2025-06-24 23:10:46 +08:00
徐聪
ea6b0a6902 fix: failed to exec web script 2025-06-24 23:02:35 +08:00
lilong.129
72a0915b04 fix: adb double tap 2025-06-24 22:54:26 +08:00
lilong.129
d0ceeb6c51 refactor: update AI result handling to differentiate content and thought based on result types in report generation 2025-06-24 16:01:50 +08:00
lilong.129
b1719344c0 feat: enhance AI result handling with model name and usage statistics for query, action, and assertion types 2025-06-24 15:25:12 +08:00
lilong.129
8fc8d06604 feat: unify AI action handling with detailed execution results and enhanced UI integration 2025-06-24 13:42:08 +08:00
lilong.129
fc32b5d874 feat: enhance AI query handling with detailed result structure and improved UI display 2025-06-24 11:50:37 +08:00
lilong.129
58befd6eae refactor: rename buildMCPCallToolRequest to BuildMCPCallToolRequest for consistency across the codebase 2025-06-22 22:54:12 +08:00
lilong.129
6cc3c3acb5 refactor: update driver caching mechanism to use generic CacheManager and improve metadata handling 2025-06-22 21:42:50 +08:00
lilong.129
e48bbb2271 change: remove unused code 2025-06-22 13:25:55 +08:00
lilong.129
a1c8b7fab3 refactor: remove unused handlers and related files to streamline the server codebase 2025-06-21 22:08:54 +08:00
lilong.129
d2031cb0f2 refactor: add context support to sleep functions for improved cancellation handling 2025-06-20 19:12:27 +08:00
lilong.129
0c9dac95a1 feat: enhance report generation by integrating session data and improving AI query display 2025-06-20 17:38:36 +08:00
lilong.129
ed5d3127cb fix: add missing action options 2025-06-19 21:57:26 +08:00
lilong.129
9e589dec16 feat: add initialization of nil fields in summary data to prevent template execution errors 2025-06-19 14:46:56 +08:00
lilong.129
e40db65287 feat: enhance report generation with new AI query and validation display features 2025-06-18 22:35:19 +08:00
lilong.129
a3f2ff37bc refactor: replace hardcoded log messages with constants for better maintainability 2025-06-18 17:17:29 +08:00
lilong.129
1f3366453e feat: implement structured response parsing with enhanced error recovery and UTF-8 sanitization 2025-06-18 16:59:35 +08:00
lilong.129
6965cf9fe9 refactor: enhance screenshot functionality with session saving and optional CV processing 2025-06-18 16:13:45 +08:00
lilong.129
a890981e2d fix: update StartTime to use UnixMilli for better precision across step functions 2025-06-18 13:51:44 +08:00
lilong.129
3d2707fa36 feat: add home and back action mappings to planner prompts 2025-06-18 12:01:53 +08:00
lilong.129
a78ba90d33 refactor: config results path 2025-06-15 23:31:36 +08:00
lilong.129
b271e655b1 feat: add MCP plugin support and optimize AI service configuration
- Add UIXT runner with MCP plugin support
- Refactor AI service options handling
- Optimize configuration parsing for LLM and CV services
- Update dependencies to latest versions
2025-06-13 20:24:57 +08:00
lilong.129
409cd693f0 refactor: GetScreenshotBase64WithSize 2025-06-13 12:01:21 +08:00
lilong.129
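f6e7e970f8 feat: implement AIQuery with OutputSchema support
- Add AIQuery method to StepMobile, supporting extraction of information from the screen using natural language
- Implement the full AIQuery functionality in driver_ext_ai.go, including screenshot capture and LLM query
- Add OutputSchema support, allowing users to define custom output formats for structured queries
- Add ToolAIQuery MCP tool, fully integrated into the MCP server
- Add OutputSchema field and WithOutputSchema option function to ActionOptions
- Add configuration support and field mapping for ACTION_Query
- Improve test coverage:
  * Add TestAIQuery unit tests covering multiple OutputSchema usage scenarios
  * Add TestToolAIQuery MCP tool test
  * Define structs such as GameInfo and UIElementInfo for testing
- Update documentation:
  * Add a complete AIQuery usage guide to docs/uixt/ai.md
  * Include basic usage, OutputSchema examples, best practices, etc.
- Support complex nested structs and array types in OutputSchema
- Keep the API design consistent with the existing AIAction and AIAssert features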
2025-06-13 10:27:08 +08:00
lilong.129
72df285fed fix: get resultsPath 2025-06-12 14:51:15 +08:00
lilong.129
51ee639cac docs: update docs 2025-06-11 14:57:08 +08:00
lilong.129
fbc888655f feat: optimize ILLMService interface to support different models for each component
- Add LLMServiceConfig to support mixed model configuration
- Enable Planner, Asserter, Querier to use different optimal models
- Provide recommended configurations for various use cases
- Maintain backward compatibility with existing API
- Update documentation to reflect current state without iteration history
- Merge test files and add comprehensive configuration tests
- Resolve circular dependency by moving config to option package
2025-06-11 12:18:31 +08:00
lilong.129
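50414ec74d fix(ai): fix OpenAI structured output parsing and refactor code structure
- Fix parsing of the properties wrapper layer in OpenAI structured output
- Refactor parseCustomSchemaResult for better maintainability:
  - Split it into several small single-responsibility functions
  - Eliminate duplicated field extraction logic
  - Use a clear strategy pattern to handle the different parsing scenarios
- Strengthen test cases with concrete value and structure validation
- Maintain full backward compatibility; all existing tests pass

Fixes: TestQueryFunctionality/ComprehensiveAnalysis test failure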
2025-06-11 11:15:02 +08:00
lilong.129
caf75b087b fix: remove unnecessary tests 2025-06-10 22:52:52 +08:00
lilong.129
81a92ae155 docs: update AI module README with latest features
- Add comprehensive documentation for the new Query functionality
- Update interface method names from Call to Plan for consistency
- Add OpenAI GPT-4O model support documentation
- Include detailed usage examples for basic and custom schema queries
- Add configuration examples for multiple model services
- Document new features like ResetHistory, Usage statistics, and automatic type conversion
- Expand advanced features section with custom output format examples
- Update all code examples to reflect the latest API changes

The documentation now reflects the current state of the AI module with all three core capabilities:
- Planning (renamed from Call)
- Assertion
- Query (new feature)

All examples and configurations are updated to match the latest implementation.
2025-06-10 20:52:44 +08:00
lilong.129
c513e56d30 feat: add Query method to ILLMService interface
- Add Query method to ILLMService interface for unified AI service access
- Update combinedLLMService to include querier functionality
- Add comprehensive tests for ILLMService Query method
- Support both basic query and custom schema query through unified interface
- Add environment variable checks for test reliability

This allows users to access all AI capabilities (planning, assertion, and query) 
through a single ILLMService interface, providing better API consistency and ease of use.
2025-06-10 20:45:49 +08:00