Commit Graph

69 Commits

Author SHA1 Message Date
lilong.129
c322d7c36c fix: improve JSON extraction to handle UTF-8 Chinese characters properly
- Replace byte-based brace counting with UTF-8 aware rune iteration
- Add proper string state tracking to handle escaped quotes
- Add comprehensive test cases for Chinese character handling
- Fix parsing errors when JSON contains Chinese text like 2048经典
2025-06-10 16:09:50 +08:00
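The extraction approach described in the commit above (rune iteration plus string state tracking) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `extractFirstJSONObject` and its signature are assumptions made for this example.

```go
package main

import "fmt"

// extractFirstJSONObject is a hypothetical helper (not taken from the
// repository). It returns the first balanced {...} object found in
// content, or "" if no complete object is present.
func extractFirstJSONObject(content string) string {
	start := -1       // byte offset of the opening brace, -1 until found
	depth := 0        // current brace nesting depth
	inString := false // true while inside a JSON string literal
	escaped := false  // true right after a backslash inside a string

	// Ranging over a string yields whole runes with their byte offsets,
	// so multi-byte characters such as Chinese text are never split.
	for i, r := range content {
		if start < 0 {
			if r == '{' {
				start, depth = i, 1
			}
			continue
		}
		if inString {
			switch {
			case escaped:
				escaped = false // the escaped rune is consumed as-is
			case r == '\\':
				escaped = true
			case r == '"':
				inString = false
			}
			continue
		}
		switch r {
		case '"':
			inString = true
		case '{':
			depth++
		case '}':
			depth--
			if depth == 0 {
				// '}' is one byte wide, so i+1 is the end offset.
				return content[start : i+1]
			}
		}
	}
	return ""
}

func main() {
	resp := "Thought: open the game\n{\"app\": \"2048经典\", \"note\": \"a \\\"}\\\" inside\"} done"
	fmt.Println(extractFirstJSONObject(resp))
}
```

Braces and escaped quotes inside string values are skipped because the scanner only counts braces while it is outside a string literal.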
lilong.129
88ae8faee1 feat: enhance VLM response parsing and DOUBAO model support
- Fix JSON extraction logic by prioritizing brace counting method
- Add support for DOUBAO string array coordinate format
- Introduce IS_UI_TARS helper function for model type checking
- Add comprehensive tests for JSON parsing and coordinate handling
- Improve error handling with retry delays for LLM service failures
2025-06-10 15:56:13 +08:00
lilong.129
4959c2e47e feat: extractJSONFromContent 2025-06-10 14:08:44 +08:00
lilong.129
7dc0f869be fix: extracts JSON content from various formats in the response 2025-06-10 14:02:41 +08:00
lilong.129
90401eeb78 change: remove unnecessary logs 2025-06-10 13:19:36 +08:00
lilong.129
12cebef3b9 change: set llm timeout to 120s 2025-06-09 22:42:19 +08:00
lilong.129
39acadb0a7 feat: add MCP tools registration to LLM service
- Add RegisterTools method to ILLMService interface
- Create shared MCP to eino tool converter
- Auto-register built-in uixt tools in XTDriver initialization
- Refactor MCPHost to use shared converter
- Add comprehensive test coverage for tool conversion

This enables doubao-1.5-thinking-vision-pro model to access
MCP tools through function calling mechanism.
2025-06-09 22:19:43 +08:00
lilong.129
96da4515a1 feat: optimize test report UI and add LLM usage tracking 2025-06-09 17:04:55 +08:00
lilong.129
cf360c8c46 feat: compress image data for html report 2025-06-08 23:48:23 +08:00
lilong.129
14cef72f5a feat: add model name display in AI actions and optimize HTML report
- Add ModelName field to PlanningResult and SubActionResult
- Update HTML report with improved layout and model name display
- Fix elapsed time setting bug and enhance mobile responsiveness
2025-06-08 22:08:51 +08:00
lilong.129
460570f651 fix(uixt): fix uixt__input not working and add comprehensive unit tests
- Fix parameter mapping issue where AI model's 'content' parameter wasn't mapped to 'text' field
- Add mapParameterName function to handle parameter name mapping (content->text, key->keycode)
- Add comprehensive unit tests for convertProcessedArgs and mapParameterName functions
- Update existing test cases to match new parameter format (x,y for single coords, from_x,from_y,to_x,to_y for drag)

This resolves the issue where uixt__input action was not working due to parameter name mismatch.
2025-06-07 15:03:29 +08:00
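A parameter-name mapping like the one this commit describes might look like the sketch below. Only the two pairs named in the message (content->text, key->keycode) are included. The signature of `mapParameterName`, the `parameterAliases` table, and the `remapArgs` helper are assumptions for illustration, not the repository's implementation.

```go
package main

import "fmt"

// parameterAliases maps names a model may emit to the field names the
// tool expects. Hypothetical table holding only the pairs named in the
// commit message.
var parameterAliases = map[string]string{
	"content": "text",
	"key":     "keycode",
}

// mapParameterName returns the mapped name, or the input unchanged when
// no alias is registered. The signature is assumed for this sketch.
func mapParameterName(name string) string {
	if mapped, ok := parameterAliases[name]; ok {
		return mapped
	}
	return name
}

// remapArgs is a hypothetical helper that applies mapParameterName to
// every key of an argument map and returns a new map.
func remapArgs(args map[string]any) map[string]any {
	out := make(map[string]any, len(args))
	for k, v := range args {
		out[mapParameterName(k)] = v
	}
	return out
}

func main() {
	fmt.Println(remapArgs(map[string]any{"content": "hello", "x": 100}))
}
```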
lilong.129
484eebdefd feat: implement multi-model service configuration support
- Support configuring multiple LLM services simultaneously
- Auto-derive model names from service types to simplify configuration
- Maintain backward compatibility with existing configurations
- Refactor configuration logic into dedicated env module
- Add comprehensive unit test coverage
- Update documentation with new configuration approach
2025-06-06 22:17:59 +08:00
lilong.129
b642ea004e feat: implement UI automation test history isolation
- Add ResetHistory option to PlanningOptions and ActionOptions
- Implement task completion detection with isTaskFinished() method
- Add executeActions() method to separate action execution logic
- Modify ConversationHistory.Clear() to completely clear all messages including system message
- Refactor StartToGoal() to automatically reset history on first attempt
- Add WithResetHistory() option function for consistent API
- Consolidate test files into driver_ext_ai_test.go with comprehensive test coverage
2025-06-06 15:29:42 +08:00
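The option function and history reset named in this commit follow the common Go functional-options pattern. The sketch below assumes all type shapes and signatures; only the names `ActionOptions`, `ResetHistory`, `WithResetHistory`, `ConversationHistory`, and `Clear` come from the commit message, and `prepareHistory` is a hypothetical helper.

```go
package main

import "fmt"

// ActionOptions holds per-call settings. Only ResetHistory is modeled.
type ActionOptions struct {
	ResetHistory bool
}

// ActionOption mutates ActionOptions (functional-options pattern).
type ActionOption func(*ActionOptions)

// WithResetHistory sets the ResetHistory flag. Signature is assumed.
func WithResetHistory(reset bool) ActionOption {
	return func(o *ActionOptions) { o.ResetHistory = reset }
}

// NewActionOptions applies the given options to a zero-value struct.
func NewActionOptions(opts ...ActionOption) *ActionOptions {
	o := &ActionOptions{}
	for _, apply := range opts {
		apply(o)
	}
	return o
}

// Message and ConversationHistory are simplified stand-ins.
type Message struct{ Role, Content string }
type ConversationHistory []Message

// Clear drops every message, including any system message.
func (h *ConversationHistory) Clear() { *h = (*h)[:0] }

// prepareHistory is a hypothetical helper: it resets the history on the
// first attempt or when the caller explicitly asks for a reset.
func prepareHistory(h *ConversationHistory, attempt int, opts *ActionOptions) {
	if attempt == 1 || opts.ResetHistory {
		h.Clear()
	}
}

func main() {
	h := ConversationHistory{{Role: "system", Content: "you are a UI agent"}}
	prepareHistory(&h, 2, NewActionOptions(WithResetHistory(true)))
	fmt.Println("messages after reset:", len(h))
}
```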
lilong.129
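5f400735fc fix: StartToGoal command could not be interrupted with CTRL+C
- Add a context.Context parameter to AI-related methods to support interruption
- Add context cancellation checks in the retry loop
- Create a cancellable context and listen for interrupt signals
- Update MCP tool calls to use context-aware methods

Users can now interrupt long-running AI automation tasks with CTRL+C.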
2025-06-05 20:00:20 +08:00
lilong.129
d883aa6a21 change: rename VLM name 2025-06-05 18:09:25 +08:00
lilong.129
8cdc71d90b change: RoundToOneDecimal 2025-06-05 17:47:29 +08:00
lilong.129
0add3231ff refactor: merge ActionSummary and Thought fields to eliminate duplication
- Remove redundant ActionSummary field from PlanningResult struct
- Update parsers to use unified Thought field instead of duplicate fields
- Modify chat interface to display Thought instead of ActionSummary
- Update planner logging to use thought instead of summary
- Adjust prompt templates to use thought field consistently
- Switch test LLM service from UI-TARS to DoubaoVL
- Add default parameter handling for sleep tool
2025-06-05 14:19:09 +08:00
lilong.129
0864f74021 fix: update AI parser to use doubao-1.5-thinking-vision-pro configuration 2025-06-05 13:28:31 +08:00
lilong.129
c204542f1f feat: optimize UI-TARS parser with coordinate conversion and action mapping
- Add action mapping for UI-TARS parser to convert action names to option.ActionName
- Implement bounding box to center point coordinate conversion for better accuracy
- Update coordinate normalization to handle coordinates > 1000 properly
- Enhance test cases to verify coordinate scaling and center point conversion
- Improve action argument processing with proper coordinate transformation
- Add comprehensive test coverage for coordinate conversion edge cases

Key improvements:
- Bounding box [x1,y1,x2,y2] now converts to center point [cx,cy] for actions
- Coordinate scaling properly handles different screen resolutions
- Action names are mapped through doubao_1_5_ui_tars_action_mapping
- Enhanced error handling for invalid coordinate formats
2025-06-04 23:16:14 +08:00
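The bounding-box-to-center conversion and coordinate scaling listed above come down to simple arithmetic. The sketch below assumes model coordinates lie on a normalized 0..1000 grid; the function names and the scaling factor are illustrative assumptions, and the commit's handling of coordinates greater than 1000 is not modeled.

```go
package main

import "fmt"

// bboxCenter converts a bounding box [x1, y1, x2, y2] to its center
// point (cx, cy).
func bboxCenter(box [4]float64) (cx, cy float64) {
	return (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
}

// scaleToScreen maps a point from a normalized 0..factor grid to pixel
// coordinates on a screen of the given size. The factor (for example
// 1000) is an assumption of this sketch.
func scaleToScreen(x, y, factor float64, width, height int) (px, py float64) {
	return x * float64(width) / factor, y * float64(height) / factor
}

func main() {
	cx, cy := bboxCenter([4]float64{100, 200, 300, 400})
	px, py := scaleToScreen(cx, cy, 1000, 1080, 1920)
	fmt.Println(cx, cy, px, py)
}
```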
lilong.129
2a392f204b docs: add comprehensive AI module documentation
- Add detailed documentation for HttpRunner AI module
- Cover planning, assertion, computer vision, and session management
- Include architecture design, usage guide, and configuration
- Provide code examples and best practices
- Document all core components and interfaces
2025-05-30 00:45:44 +08:00
lilong.129
f20fdd51bc feat: Validate model type and model name compatibility 2025-05-26 09:40:28 +08:00
lilong.129
4e74247cab fix: missing tool call ID 2025-05-26 09:28:46 +08:00
lilong.129
014140ccc7 change: append tool call message for planner 2025-05-24 10:28:55 +08:00
lilong.129
b639b4473f test: update unittests 2025-05-24 01:00:30 +08:00
lilong.129
81c854f963 refactor: merge ai parser 2025-05-24 00:25:44 +08:00
lilong.129
19ddcb40cc change: update ui-tars prompt 2025-05-23 22:05:21 +08:00
lilong.129
009bfa4ecb refactor: replace ui-tars parser with https://github.com/bytedance/UI-TARS/blob/main/codes/ui_tars/action_parser.py 2025-05-22 22:52:47 +08:00
lilong.129
3b77ade24f refactor: json asserter 2025-05-22 18:22:12 +08:00
lilong.129
c377664518 refactor: add LLMServiceTypeDoubaoVL 2025-05-22 15:34:11 +08:00
lilong.129
bb592548b4 fix: chat with screenshot 2025-05-21 22:35:16 +08:00
lilong.129
037e69315e change: remove unused code 2025-05-20 18:03:54 +08:00
lilong.129
b2ab14efcc refactor: rename to AssertionResult 2025-05-19 11:51:49 +08:00
lilong.129
3f1ee03529 refactor: mcphost planner 2025-05-18 21:55:01 +08:00
lilong.129
6569121d5d refactor: move LoadImage 2025-04-30 16:21:01 +08:00
lilong.129
fcddcfb630 refactor: GetModelConfig 2025-04-30 15:21:17 +08:00
lilong.129
0e9389c796 refactor: NewXTDriver api, return error if init failed 2025-04-30 14:31:36 +08:00
lilong.129
2ae252b52a refactor: merge planner 2025-04-30 14:07:48 +08:00
lilong.129
cc9a527274 refactor: select model type by env LLM_MODEL_USE 2025-04-29 23:14:12 +08:00
lilong.129
3ffa5d96d2 refactor: config llm env 2025-04-29 22:33:18 +08:00
lilong.129
429bfe3986 feat: assert with openai model 2025-04-29 22:03:11 +08:00
lilong.129
4d7c7e8aaf refactor: ai asserter 2025-04-29 20:08:22 +08:00
lilong.129
14e353a572 fix: test failed by prompt 2025-04-29 12:10:16 +08:00
lilong.129
7132eec39e feat: add status code for llm 2025-04-28 21:06:53 +08:00
lilong.129
68dbeb368a refactor: adds a message to the conversation history 2025-04-28 20:12:08 +08:00
lilong.129
427cc1dab2 fix: potential file inclusion via variable 2025-04-28 19:59:21 +08:00
lilong.129
4d7dc466f3 feat: set LLMService/CVService for case runner 2025-04-28 19:45:07 +08:00
lilong.129
7fa4155390 refactor: move code 2025-04-27 22:37:48 +08:00
lilong.129
9bcdd5d19a feat: add AIAssert 2025-04-27 22:25:06 +08:00
lilong.129
84ff75c3b1 change: add tests 2025-04-27 19:13:55 +08:00
lilong.129
817dc4d6a5 change: set Temperature for ark model 2025-04-25 17:27:37 +08:00