lilong.129
5f400735fc
fix: StartToGoal command could not be interrupted with CTRL+C
...
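- Add a context.Context parameter to AI-related methods to support interruption
- Add context cancellation checks in the retry loop
- Create a cancellable context and listen for interrupt signals
- Update MCP tool calls to use context-aware methods
Users can now interrupt long-running AI automation tasks with CTRL+C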
2025-06-05 20:00:20 +08:00
lilong.129
d883aa6a21
change: rename VLM name
2025-06-05 18:09:25 +08:00
lilong.129
8cdc71d90b
change: RoundToOneDecimal
2025-06-05 17:47:29 +08:00
lilong.129
0add3231ff
refactor: merge ActionSummary and Thought fields to eliminate duplication
...
- Remove redundant ActionSummary field from PlanningResult struct
- Update parsers to use unified Thought field instead of duplicate fields
- Modify chat interface to display Thought instead of ActionSummary
- Update planner logging to use thought instead of summary
- Adjust prompt templates to use thought field consistently
- Switch test LLM service from UI-TARS to DoubaoVL
- Add default parameter handling for sleep tool
2025-06-05 14:19:09 +08:00
lilong.129
0864f74021
fix: update AI parser to use doubao-1.5-thinking-vision-pro configuration
2025-06-05 13:28:31 +08:00
lilong.129
c204542f1f
feat: optimize UI-TARS parser with coordinate conversion and action mapping
...
- Add action mapping for UI-TARS parser to convert action names to option.ActionName
- Implement bounding box to center point coordinate conversion for better accuracy
- Update coordinate normalization to handle coordinates > 1000 properly
- Enhance test cases to verify coordinate scaling and center point conversion
- Improve action argument processing with proper coordinate transformation
- Add comprehensive test coverage for coordinate conversion edge cases
Key improvements:
- Bounding box [x1,y1,x2,y2] now converts to center point [cx,cy] for actions
- Coordinate scaling properly handles different screen resolutions
- Action names are mapped through doubao_1_5_ui_tars_action_mapping
- Enhanced error handling for invalid coordinate formats
2025-06-04 23:16:14 +08:00
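The bounding-box-to-center conversion and coordinate scaling mentioned in the commit above can be sketched as follows. The function names are hypothetical, and the assumption that model coordinates lie on a 0 to 1000 relative grid is an illustration only; the commit does not state the exact normalization rule, nor how values above 1000 are treated.

```go
package main

import "fmt"

// bboxCenter (hypothetical helper) returns the center point of a
// bounding box given as [x1, y1, x2, y2].
func bboxCenter(box [4]float64) (cx, cy float64) {
	return (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
}

// scaleToScreen (hypothetical helper) maps a coordinate from an assumed
// 0..1000 relative grid to pixels along an axis of length screenSize.
func scaleToScreen(v, screenSize float64) float64 {
	return v * screenSize / 1000
}

func main() {
	// A box in relative coordinates, converted to a tap point on a
	// 1080x1920 screen.
	cx, cy := bboxCenter([4]float64{100, 200, 300, 400})
	fmt.Println(scaleToScreen(cx, 1080), scaleToScreen(cy, 1920))
}
```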
lilong.129
2a392f204b
docs: add comprehensive AI module documentation
...
- Add detailed documentation for HttpRunner AI module
- Cover planning, assertion, computer vision, and session management
- Include architecture design, usage guide, and configuration
- Provide code examples and best practices
- Document all core components and interfaces
2025-05-30 00:45:44 +08:00
lilong.129
f20fdd51bc
feat: Validate model type and model name compatibility
2025-05-26 09:40:28 +08:00
lilong.129
4e74247cab
fix: add missing tool call ID
2025-05-26 09:28:46 +08:00
lilong.129
014140ccc7
change: append tool call message for planner
2025-05-24 10:28:55 +08:00
lilong.129
b639b4473f
test: update unittests
2025-05-24 01:00:30 +08:00
lilong.129
81c854f963
refactor: merge ai parser
2025-05-24 00:25:44 +08:00
lilong.129
19ddcb40cc
change: update ui-tars prompt
2025-05-23 22:05:21 +08:00
lilong.129
009bfa4ecb
refactor: replace ui-tars parser with https://github.com/bytedance/UI-TARS/blob/main/codes/ui_tars/action_parser.py
2025-05-22 22:52:47 +08:00
lilong.129
3b77ade24f
refactor: json asserter
2025-05-22 18:22:12 +08:00
lilong.129
c377664518
refactor: add LLMServiceTypeDoubaoVL
2025-05-22 15:34:11 +08:00
lilong.129
bb592548b4
fix: chat with screenshot
2025-05-21 22:35:16 +08:00
lilong.129
037e69315e
change: remove unused code
2025-05-20 18:03:54 +08:00
lilong.129
b2ab14efcc
refactor: rename to AssertionResult
2025-05-19 11:51:49 +08:00
lilong.129
3f1ee03529
refactor: mcphost planner
2025-05-18 21:55:01 +08:00
lilong.129
6569121d5d
refactor: move LoadImage
2025-04-30 16:21:01 +08:00
lilong.129
fcddcfb630
refactor: GetModelConfig
2025-04-30 15:21:17 +08:00
lilong.129
0e9389c796
refactor: NewXTDriver api, return error if init failed
2025-04-30 14:31:36 +08:00
lilong.129
2ae252b52a
refactor: merge planner
2025-04-30 14:07:48 +08:00
lilong.129
cc9a527274
refactor: select model type by env LLM_MODEL_USE
2025-04-29 23:14:12 +08:00
lilong.129
3ffa5d96d2
refactor: config llm env
2025-04-29 22:33:18 +08:00
lilong.129
429bfe3986
feat: assert with openai model
2025-04-29 22:03:11 +08:00
lilong.129
4d7c7e8aaf
refactor: ai asserter
2025-04-29 20:08:22 +08:00
lilong.129
14e353a572
fix: test failure caused by prompt
2025-04-29 12:10:16 +08:00
lilong.129
7132eec39e
feat: add status code for llm
2025-04-28 21:06:53 +08:00
lilong.129
68dbeb368a
refactor: add a message to the conversation history
2025-04-28 20:12:08 +08:00
lilong.129
427cc1dab2
fix: potential file inclusion via variable
2025-04-28 19:59:21 +08:00
lilong.129
4d7dc466f3
feat: set LLMService/CVService for case runner
2025-04-28 19:45:07 +08:00
lilong.129
7fa4155390
refactor: move code
2025-04-27 22:37:48 +08:00
lilong.129
9bcdd5d19a
feat: add AIAsert
2025-04-27 22:25:06 +08:00
lilong.129
84ff75c3b1
change: add tests
2025-04-27 19:13:55 +08:00
lilong.129
817dc4d6a5
change: set Temperature for ark model
2025-04-25 17:27:37 +08:00
lilong.129
70a8ee01f7
refactor: llm planner
2025-04-21 21:33:30 +08:00
lilong.129
ebeae596a7
stash
2025-04-21 14:39:37 +08:00
lilong.129
563015c55a
refactor: LoadEnv
2025-03-31 14:54:58 +08:00
lilong.129
561560accb
fix: avoid implicit memory aliasing in for loop
2025-03-31 10:54:59 +08:00
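For context on the commit above: before Go 1.22, the variable in a `for ... range` loop was reused across iterations, so taking its address inside the loop produced pointers that all aliased the same storage. One common remedy, shown here as a minimal sketch with hypothetical names (`step`, `pointersTo`), is to address the slice element by index. The actual change made in the commit is not shown in this log.

```go
package main

import "fmt"

// step is a hypothetical element type used only for this illustration.
type step struct{ Name string }

// pointersTo returns a pointer to each element of the slice itself,
// rather than to a loop variable that (before Go 1.22) was shared by
// every iteration.
func pointersTo(steps []step) []*step {
	ptrs := make([]*step, 0, len(steps))
	for i := range steps {
		ptrs = append(ptrs, &steps[i]) // address of the element, not of a loop copy
	}
	return ptrs
}

func main() {
	steps := []step{{"tap"}, {"swipe"}, {"input"}}
	for _, p := range pointersTo(steps) {
		fmt.Println(p.Name)
	}
}
```

Indexing also means the pointers refer to the original elements, so writes through them are visible in the slice.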
lilong.129
2ad5c4f6db
fix: load env
2025-03-23 10:06:50 +08:00
lilong.129
148d70accf
change: load env once
2025-03-22 15:23:39 +08:00
lilong.129
f46fcfb456
fix: parse result for finished type
2025-03-22 01:19:23 +08:00
lilong.129
12e0f7f9a2
feat: save screenshots for PlanNextAction
2025-03-22 01:07:28 +08:00
lilong.129
8a3b6b5c4c
feat: appendConversationHistory for ai planner
2025-03-22 00:06:30 +08:00
lilong.129
868acd45ac
fix: load jpeg image
2025-03-20 20:39:32 +08:00
lilong.129
da0bdc4fe5
fix: convertCoordinateAction
2025-03-20 18:02:35 +08:00
lilong.129
3801ffb744
feat: load .env file from current working directory upward recursively
2025-03-20 14:23:56 +08:00
lilong.129
b5f3e7ff96
change: remove unused code
2025-03-19 22:47:10 +08:00