16 Commits

Author SHA1 Message Date
lilong.129
3d2707fa36 feat: add home and back action mappings to planner prompts 2025-06-18 12:01:53 +08:00
lilong.129
98bd41ff33 fix: add direction parameter support for scroll operations in UI-TARS parser
- Handle direction parameter in convertProcessedArgs for scroll actions
- Ensure scroll operations map to swipe with both coordinates and direction
- Add comprehensive test coverage for scroll action parsing
- Fix issue where scroll direction was missing from tool call arguments
2025-06-10 16:40:10 +08:00
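The scroll fix in 98bd41ff33 can be pictured with a small sketch. It is not the repository's `convertProcessedArgs`; the `swipeArgs` type, the `scrollToSwipeArgs` function, and the accepted direction names are all hypothetical, chosen only to show a scroll action carrying both a coordinate and a direction into the swipe arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// swipeArgs is a hypothetical argument set for a swipe tool call.
// The real tool-call schema in the repository may differ.
type swipeArgs struct {
	X, Y      float64
	Direction string
}

// scrollToSwipeArgs keeps both the start point and the direction,
// so the direction is not dropped when scroll is mapped to swipe.
// The set of valid directions is an assumption for this sketch.
func scrollToSwipeArgs(x, y float64, direction string) (swipeArgs, error) {
	d := strings.ToLower(strings.TrimSpace(direction))
	switch d {
	case "up", "down", "left", "right":
		return swipeArgs{X: x, Y: y, Direction: d}, nil
	default:
		return swipeArgs{}, fmt.Errorf("unsupported scroll direction %q", direction)
	}
}

func main() {
	args, err := scrollToSwipeArgs(500, 500, "Down")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", args)
}
```

Rejecting unknown directions, rather than defaulting silently, makes a missing direction visible at parse time.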
lilong.129
cf360c8c46 feat: compress image data for html report 2025-06-08 23:48:23 +08:00
lilong.129
0add3231ff refactor: merge ActionSummary and Thought fields to eliminate duplication
- Remove redundant ActionSummary field from PlanningResult struct
- Update parsers to use unified Thought field instead of duplicate fields
- Modify chat interface to display Thought instead of ActionSummary
- Update planner logging to use thought instead of summary
- Adjust prompt templates to use thought field consistently
- Switch test LLM service from UI-TARS to DoubaoVL
- Add default parameter handling for sleep tool
2025-06-05 14:19:09 +08:00
lilong.129
0864f74021 fix: update AI parser to use doubao-1.5-thinking-vision-pro configuration 2025-06-05 13:28:31 +08:00
lilong.129
c204542f1f feat: optimize UI-TARS parser with coordinate conversion and action mapping
- Add action mapping for UI-TARS parser to convert action names to option.ActionName
- Implement bounding box to center point coordinate conversion for better accuracy
- Update coordinate normalization to handle coordinates > 1000 properly
- Enhance test cases to verify coordinate scaling and center point conversion
- Improve action argument processing with proper coordinate transformation
- Add comprehensive test coverage for coordinate conversion edge cases

Key improvements:
- Bounding box [x1,y1,x2,y2] now converts to center point [cx,cy] for actions
- Coordinate scaling properly handles different screen resolutions
- Action names are mapped through doubao_1_5_ui_tars_action_mapping
- Enhanced error handling for invalid coordinate formats
2025-06-04 23:16:14 +08:00
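The coordinate handling described in c204542f1f can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: `boxCenter`, `scaleToScreen`, and the 0 to 1000 relative grid (`relativeMax`) are hypothetical, and the commit's handling of coordinates above 1000 is not detailed in the log, so it is left out here.

```go
package main

import (
	"errors"
	"fmt"
)

// relativeMax is the assumed size of the model's relative coordinate grid.
const relativeMax = 1000.0

// boxCenter returns the center point [cx, cy] of a bounding box [x1, y1, x2, y2].
// It returns an error when the slice does not hold exactly four values.
func boxCenter(box []float64) (cx, cy float64, err error) {
	if len(box) != 4 {
		return 0, 0, errors.New("bounding box must have exactly 4 values")
	}
	return (box[0] + box[2]) / 2, (box[1] + box[3]) / 2, nil
}

// scaleToScreen maps a point on the relative grid to pixel coordinates
// for a screen of the given width and height.
func scaleToScreen(x, y float64, width, height int) (float64, float64) {
	return x * float64(width) / relativeMax, y * float64(height) / relativeMax
}

func main() {
	cx, cy, err := boxCenter([]float64{100, 200, 300, 400})
	if err != nil {
		panic(err)
	}
	px, py := scaleToScreen(cx, cy, 1080, 1920)
	fmt.Println(cx, cy, px, py)
}
```

Taking the center before scaling means a single point is handed to the action regardless of how large the detected box is.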
lilong.129
b639b4473f test: update unittests 2025-05-24 01:00:30 +08:00
lilong.129
81c854f963 refactor: merge ai parser 2025-05-24 00:25:44 +08:00
lilong.129
19ddcb40cc change: update ui-tars prompt 2025-05-23 22:05:21 +08:00
lilong.129
009bfa4ecb refactor: replace ui-tars parser with https://github.com/bytedance/UI-TARS/blob/main/codes/ui_tars/action_parser.py 2025-05-22 22:52:47 +08:00
lilong.129
3b77ade24f refactor: json asserter 2025-05-22 18:22:12 +08:00
lilong.129
c377664518 refactor: add LLMServiceTypeDoubaoVL 2025-05-22 15:34:11 +08:00
lilong.129
037e69315e change: remove unused code 2025-05-20 18:03:54 +08:00
lilong.129
3f1ee03529 refactor: mcphost planner 2025-05-18 21:55:01 +08:00
lilong.129
2ae252b52a refactor: merge planner 2025-04-30 14:07:48 +08:00
lilong.129
4d7c7e8aaf refactor: ai asserter 2025-04-29 20:08:22 +08:00