Commit Graph

26 Commits

Author SHA1 Message Date
lilong.129
c4e7ab00a7 feat: implement ToolStartToGoal and fix LLM service initialization
- Add ToolStartToGoal implementation with AI-driven goal automation
- Fix the uninitialized LLM service by applying the global AI config when creating XTDriver
- Ensure XTDriver is created with proper AI services from the first initialization
- Add StartToGoal method to StepMobile for goal-oriented automation
- Register ToolStartToGoal in MCP server and add corresponding action type
- Add comprehensive test case for StartToGoal functionality
- Fix ReturnSchema consistency across AI tools (StartToGoal, AIAction, Finished)
- Extract AI service options in MCP argument processing

This resolves the root cause where XTDriver was created without AI services
in runStepMobileUI, ensuring only one XTDriver initialization with complete
AI service configuration.
2025-06-05 16:52:11 +08:00
lilong.129
c204542f1f feat: optimize UI-TARS parser with coordinate conversion and action mapping
- Add action mapping for UI-TARS parser to convert action names to option.ActionName
- Implement bounding box to center point coordinate conversion for better accuracy
- Update coordinate normalization to handle coordinates > 1000 properly
- Enhance test cases to verify coordinate scaling and center point conversion
- Improve action argument processing with proper coordinate transformation
- Add comprehensive test coverage for coordinate conversion edge cases

Key improvements:
- Bounding box [x1,y1,x2,y2] now converts to center point [cx,cy] for actions
- Coordinate scaling properly handles different screen resolutions
- Action names are mapped through doubao_1_5_ui_tars_action_mapping
- Enhanced error handling for invalid coordinate formats
2025-06-04 23:16:14 +08:00
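
The bounding-box-to-center conversion and coordinate scaling described in commit c204542f1f can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `Point`, `boxCenter`, and `scaleToScreen` are hypothetical names, and the sketch assumes the model emits coordinates on a relative 0..1000 grid that must be mapped to screen pixels.

```go
package main

import (
	"errors"
	"fmt"
)

// Point is a hypothetical coordinate pair (not a type from the repository).
type Point struct{ X, Y float64 }

// boxCenter returns the center [cx, cy] of a bounding box [x1, y1, x2, y2].
// It rejects input that does not have exactly four values.
func boxCenter(box []float64) (Point, error) {
	if len(box) != 4 {
		return Point{}, errors.New("bounding box must have exactly 4 values")
	}
	return Point{X: (box[0] + box[2]) / 2, Y: (box[1] + box[3]) / 2}, nil
}

// scaleToScreen maps a point from an assumed 0..1000 relative grid
// to pixel coordinates for a screen of the given width and height.
func scaleToScreen(p Point, width, height float64) Point {
	return Point{X: p.X * width / 1000, Y: p.Y * height / 1000}
}

func main() {
	center, err := boxCenter([]float64{100, 200, 300, 400})
	if err != nil {
		panic(err)
	}
	fmt.Println(scaleToScreen(center, 1080, 1920))
}
```

Taking the center first and scaling second means a tap always lands inside the detected element, whatever the screen resolution.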
lilong.129
bd8cb5abf4 refactor: move MobileAction to option package and update imports
- Move MobileAction struct from uixt package to uixt/option package
- Delete uixt/driver_action.go file as MobileAction is now in option package
- Update all import statements across the codebase to use option.MobileAction
- Update ActionTool interface to use option.MobileAction in ConvertActionToCallToolRequest method
- Maintain backward compatibility while improving package organization
- Clean up code structure by consolidating action-related types in option package

Files affected:
- server/uixt.go: Updated imports and type references
- step.go: Updated imports and ActionResult struct
- step_ui.go: Updated all MobileAction references to option.MobileAction
- uixt/mcp_server.go: Updated ActionTool interface and removed detailed comments
- uixt/mcp_server_test.go: Updated all test cases to use option.MobileAction
- uixt/mcp_tools_*.go: Updated ConvertActionToCallToolRequest method signatures
- uixt/option/action.go: Added MobileAction struct definition
- uixt/sdk.go: Updated ExecuteAction method signature
2025-06-03 18:15:28 +08:00
lilong.129
2fe5b14d63 refactor: integrate and optimize MCP tool calling methods 2025-05-27 21:39:17 +08:00
lilong.129
866cc0e4d2 feat: implement MCP hooks integration with anti_risk option 2025-05-27 19:46:08 +08:00
lilong.129
404865ba6b refactor: complete ActionOptions unification and pointer type optimization 2025-05-27 13:34:12 +08:00
lilong.129
7fb966b7ba refactor: improve ActionMethod type safety and eliminate type conversions 2025-05-27 11:49:30 +08:00
lilong.129
6ae4c300c1 add generic swipe tool with auto-detection of direction vs coordinate params
- Added ACTION_Swipe to option/action.go for generic swipe functionality
- Implemented ToolSwipe in mcp_server.go that automatically detects parameter type:
  - String params (up/down/left/right) use direction-based swipe logic
  - Array params [fromX, fromY, toX, toY] use coordinate-based swipe logic
- Added comprehensive test coverage for ToolSwipe in mcp_server_test.go
- Updated tool registration to include the new generic swipe tool
- All tests pass, confirming backward compatibility with existing tools
2025-05-26 22:39:23 +08:00
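
The parameter auto-detection described in commit 6ae4c300c1 could look something like the sketch below. It is an assumption-laden illustration rather than the actual ToolSwipe code: `SwipeKind` and `detectSwipe` are hypothetical names, and coordinate params are assumed to arrive as a `[]float64`.

```go
package main

import "fmt"

// SwipeKind is a hypothetical tag for the two swipe styles.
type SwipeKind int

const (
	SwipeByDirection SwipeKind = iota
	SwipeByCoordinates
)

// detectSwipe inspects the dynamic type of params:
// a string is treated as a direction, a 4-element slice as
// [fromX, fromY, toX, toY]. Anything else is an error.
func detectSwipe(params any) (SwipeKind, error) {
	switch v := params.(type) {
	case string:
		switch v {
		case "up", "down", "left", "right":
			return SwipeByDirection, nil
		}
		return 0, fmt.Errorf("unknown swipe direction %q", v)
	case []float64:
		if len(v) != 4 {
			return 0, fmt.Errorf("expected 4 coordinates, got %d", len(v))
		}
		return SwipeByCoordinates, nil
	default:
		return 0, fmt.Errorf("unsupported swipe params type %T", params)
	}
}

func main() {
	for _, p := range []any{"up", []float64{100, 800, 100, 200}, 42} {
		kind, err := detectSwipe(p)
		fmt.Println(kind, err)
	}
}
```

A type switch keeps the dispatch in one place, so callers can pass either form without choosing between separate direction and coordinate tools.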
lilong.129
9a5e0849de fix: handle GetOrCreateXTDriver when serial is empty 2025-05-26 21:25:25 +08:00
lilong.129
36c5044402 feat: add mcp tool finished 2025-05-26 09:05:48 +08:00
lilong.129
778344c826 change: remove call function tool 2025-05-26 00:43:01 +08:00
lilong.129
2e17d9df16 refactor: merge DoAction to mcp server tools 2025-05-25 23:53:07 +08:00
lilong.129
4ff2692f02 refactor: move action options 2025-05-25 00:15:18 +08:00
lilong.129
d145784910 fix: swipe with params 2025-05-14 14:36:46 +08:00
lilong.129
d95eec78b0 feat: add WithPreMarkOperation and WithPostMarkOperation to mark UI operation before/after action 2025-05-12 08:58:27 +08:00
lilong.129
9bafea53af feat: support action options for AppLaunch/AppTerminate 2025-05-10 00:01:30 +08:00
lilong.129
3715cbb432 feat: support pre hook and post hook for actions 2025-05-09 23:01:27 +08:00
lilong.129
cfc71819d2 feat: mark tap/swipe UI operation 2025-05-05 16:31:13 +08:00
lilong.129
d2976844fc fix: load testcase panic caused by config options 2025-04-27 11:50:50 +08:00
lilong.129
182de16751 feat: ApplySwipeOffset 2025-03-17 17:57:10 +08:00
lilong.129
9fb53590ca refactor: rename ApplyTapOffset 2025-03-17 17:44:04 +08:00
lilong.129
b34a2218fe feat: tap random point in ocr text rect 2025-03-17 15:36:35 +08:00
lilong.129
3e7e9b0ef9 change: tap/swipe with offset 2025-03-17 14:36:09 +08:00
lilong.129
0d416e74a1 change: use context to control ScreenRecord timeout or cancel 2025-03-06 22:16:49 +08:00
lilong.129
cc81c00a82 feat: add adb screen record 2025-03-06 16:57:51 +08:00
lilong.129
e107389d6e refactor: move uixt pkg 2025-03-05 11:04:02 +08:00