Commit Graph

37 Commits

Author SHA1 Message Date
lilong.129
c204542f1f feat: optimize UI-TARS parser with coordinate conversion and action mapping
- Add action mapping for UI-TARS parser to convert action names to option.ActionName
- Implement bounding box to center point coordinate conversion for better accuracy
- Update coordinate normalization to handle coordinates > 1000 properly
- Enhance test cases to verify coordinate scaling and center point conversion
- Improve action argument processing with proper coordinate transformation
- Add comprehensive test coverage for coordinate conversion edge cases

Key improvements:
- Bounding box [x1,y1,x2,y2] now converts to center point [cx,cy] for actions
- Coordinate scaling properly handles different screen resolutions
- Action names are mapped through doubao_1_5_ui_tars_action_mapping
- Enhanced error handling for invalid coordinate formats
2025-06-04 23:16:14 +08:00
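The bounding-box-to-center conversion and resolution scaling described in c204542f1f can be sketched roughly as below. This is an illustrative sketch, not the parser's actual code: the names bboxCenter, scaleToScreen, and modelScale are hypothetical, and the 0..1000 coordinate grid is an assumption about the model output.

```go
package main

import (
	"errors"
	"fmt"
)

// modelScale is an ASSUMED normalization range (0..1000) for model coordinates.
const modelScale = 1000.0

// bboxCenter (hypothetical helper) returns the center point [cx, cy]
// of a bounding box given as [x1, y1, x2, y2].
func bboxCenter(box []float64) (cx, cy float64, err error) {
	if len(box) != 4 {
		return 0, 0, errors.New("bounding box must have exactly 4 values")
	}
	return (box[0] + box[2]) / 2, (box[1] + box[3]) / 2, nil
}

// scaleToScreen (hypothetical helper) maps a point on the assumed
// 0..1000 grid to pixel coordinates for a screen of the given size.
func scaleToScreen(x, y float64, width, height int) (px, py float64) {
	return x * float64(width) / modelScale, y * float64(height) / modelScale
}

func main() {
	cx, cy, err := bboxCenter([]float64{100, 200, 300, 400})
	if err != nil {
		panic(err)
	}
	px, py := scaleToScreen(cx, cy, 1080, 1920)
	fmt.Println(cx, cy, px, py)
}
```

Converting to the center first and scaling afterwards keeps the scaling step independent of whether the model returned a box or a point.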
lilong.129
bd8cb5abf4 refactor: move MobileAction to option package and update imports
- Move MobileAction struct from uixt package to uixt/option package
- Delete uixt/driver_action.go file as MobileAction is now in option package
- Update all import statements across the codebase to use option.MobileAction
- Update ActionTool interface to use option.MobileAction in ConvertActionToCallToolRequest method
- Maintain backward compatibility while improving package organization
- Clean up code structure by consolidating action-related types in option package

Files affected:
- server/uixt.go: Updated imports and type references
- step.go: Updated imports and ActionResult struct
- step_ui.go: Updated all MobileAction references to option.MobileAction
- uixt/mcp_server.go: Updated ActionTool interface and removed detailed comments
- uixt/mcp_server_test.go: Updated all test cases to use option.MobileAction
- uixt/mcp_tools_*.go: Updated ConvertActionToCallToolRequest method signatures
- uixt/option/action.go: Added MobileAction struct definition
- uixt/sdk.go: Updated ExecuteAction method signature
2025-06-03 18:15:28 +08:00
lilong.129
2fe5b14d63 refactor: integrate and optimize MCP tool calling methods 2025-05-27 21:39:17 +08:00
lilong.129
866cc0e4d2 feat: implement MCP hooks integration with anti_risk option 2025-05-27 19:46:08 +08:00
lilong.129
404865ba6b refactor: complete ActionOptions unification and pointer type optimization 2025-05-27 13:34:12 +08:00
lilong.129
7fb966b7ba refactor: improve ActionMethod type safety and eliminate type conversions 2025-05-27 11:49:30 +08:00
lilong.129
466fe39cb9 docs: add comprehensive migration summary for ActionOptions and Request integration
- Document the complete integration process of ActionOptions and Request structures
- Include detailed statistics: 40 tools migrated with 100% test pass rate
- Provide technical implementation details and usage examples
- Record backward compatibility guarantees and migration helpers
- Summarize code quality improvements and performance optimizations
- Outline future development plans and goals

This documentation serves as a complete record of the unification initiative
and provides guidance for future development and maintenance.
2025-05-26 23:13:19 +08:00
lilong.129
6ae4c300c1 add generic swipe tool with auto-detection of direction vs coordinate params
- Added ACTION_Swipe to option/action.go for generic swipe functionality
- Implemented ToolSwipe in mcp_server.go that automatically detects parameter type:
  - String params (up/down/left/right) use direction-based swipe logic
  - Array params [fromX, fromY, toX, toY] use coordinate-based swipe logic
- Added comprehensive test coverage for ToolSwipe in mcp_server_test.go
- Updated tool registration to include the new generic swipe tool
- All tests pass, confirming backward compatibility with existing tools
2025-05-26 22:39:23 +08:00
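The parameter auto-detection described in 6ae4c300c1 could look something like the following type switch. This is a minimal sketch under assumptions: planSwipe, swipePlan, and swipeKind are hypothetical names, and the real ToolSwipe in mcp_server.go may detect and validate its parameters differently.

```go
package main

import "fmt"

// swipeKind says which swipe logic a parameter value selects.
type swipeKind int

const (
	swipeByDirection swipeKind = iota
	swipeByCoordinates
)

// swipePlan (hypothetical type) is the result of parameter detection.
type swipePlan struct {
	Kind      swipeKind
	Direction string     // set when Kind == swipeByDirection
	Coords    [4]float64 // fromX, fromY, toX, toY when Kind == swipeByCoordinates
}

// planSwipe (hypothetical helper) inspects the dynamic type of params:
// a string selects direction-based swipe, a 4-element array selects
// coordinate-based swipe.
func planSwipe(params any) (swipePlan, error) {
	switch v := params.(type) {
	case string:
		switch v {
		case "up", "down", "left", "right":
			return swipePlan{Kind: swipeByDirection, Direction: v}, nil
		}
		return swipePlan{}, fmt.Errorf("unknown swipe direction %q", v)
	case []any: // shape produced by decoding a JSON array into any
		if len(v) != 4 {
			return swipePlan{}, fmt.Errorf("expected 4 coordinates, got %d", len(v))
		}
		var c [4]float64
		for i, e := range v {
			f, ok := e.(float64)
			if !ok {
				return swipePlan{}, fmt.Errorf("coordinate %d is %T, not a number", i, e)
			}
			c[i] = f
		}
		return swipePlan{Kind: swipeByCoordinates, Coords: c}, nil
	}
	return swipePlan{}, fmt.Errorf("unsupported swipe params type %T", params)
}

func main() {
	for _, p := range []any{"up", []any{0.5, 0.8, 0.5, 0.2}, 42} {
		plan, err := planSwipe(p)
		fmt.Println(plan, err)
	}
}
```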
lilong.129
77f5683f9a fix: remove unnecessary IgnoreNotFoundError and MaxRetryTimes from coordinate-based tap tools
- Removed IgnoreNotFoundError and MaxRetryTimes parameters from TapRequest, TapAbsXYRequest, and DoubleTapXYRequest structures
- Updated corresponding tool implementations to remove references to these non-existent fields
- These parameters are not applicable to coordinate-based operations as they don't involve element searching
- Only OCR/CV-based operations need these error handling parameters

This ensures that only relevant tools have the ignore_NotFoundError functionality,
making the API more consistent and avoiding confusion.
2025-05-26 22:10:08 +08:00
lilong.129
df65f9a828 fix: MCP server ignore_NotFoundError option not working
- Fixed TapByOCR and TapByCV tools to properly handle ignore_NotFoundError option
- Added option parameters to all MCP tool request structures
- Fixed ConvertActionToCallToolRequest methods to extract action options
- Added extractActionOptionsToArguments helper function for consistent option handling
- Extended fix to all MCP tools: SwipeToTapApp, SwipeToTapText, SwipeToTapTexts, TapXY, TapAbsXY
- Added comprehensive tests for option parameter handling
- Updated test expectations to match actual registered tools

This ensures that when ignore_NotFoundError is set to true, OCR/CV operations
will return nil instead of throwing errors when target elements are not found,
allowing tests to continue execution as expected.
2025-05-26 22:02:01 +08:00
lilong.129
9a5e0849de fix: handle GetOrCreateXTDriver when serial is empty 2025-05-26 21:25:25 +08:00
lilong.129
2569670c7f feat: implement unified XTDriver cache 2025-05-26 19:39:46 +08:00
lilong.129
36c5044402 feat: add mcp tool finished 2025-05-26 09:05:48 +08:00
lilong.129
778344c826 change: remove call function tool 2025-05-26 00:43:01 +08:00
lilong.129
2e17d9df16 refactor: merge DoAction to mcp server tools 2025-05-25 23:53:07 +08:00
lilong.129
7986c4899f refactor: move DoAction to MCP tools call 2025-05-25 08:10:57 +08:00
lilong.129
4ff2692f02 refactor: move action options 2025-05-25 00:15:18 +08:00
lilong.129
97dad38b7b refactor: move tool request types to option 2025-05-24 23:51:58 +08:00
lilong.129
c377664518 refactor: add LLMServiceTypeDoubaoVL 2025-05-22 15:34:11 +08:00
lilong.129
d145784910 fix: swipe with params 2025-05-14 14:36:46 +08:00
lilong.129
d95eec78b0 feat: add WithPreMarkOperation and WithPostMarkOperation to mark UI operation before/after action 2025-05-12 08:58:27 +08:00
lilong.129
9bafea53af feat: support action options for AppLaunch/AppTerminate 2025-05-10 00:01:30 +08:00
lilong.129
3715cbb432 feat: support pre hook and post hook for actions 2025-05-09 23:01:27 +08:00
徐聪
6cce5e3c5b fix: web ui test 2025-05-07 20:12:06 +08:00
lilong.129
cfc71819d2 feat: mark tap/swipe UI operation 2025-05-05 16:31:13 +08:00
lilong.129
0e9389c796 refactor: NewXTDriver api, return error if init failed 2025-04-30 14:31:36 +08:00
lilong.129
d2976844fc fix: load testcase panic caused by config options 2025-04-27 11:50:50 +08:00
徐聪
382aad2d9f fix: fix some issues with the browser driver 2025-04-24 22:57:08 +08:00
lilong.129
182de16751 feat: ApplySwipeOffset 2025-03-17 17:57:10 +08:00
lilong.129
9fb53590ca refactor: rename ApplyTapOffset 2025-03-17 17:44:04 +08:00
lilong.129
b34a2218fe feat: tap random point in ocr text rect 2025-03-17 15:36:35 +08:00
lilong.129
3e7e9b0ef9 change: tap/swipe with offset 2025-03-17 14:36:09 +08:00
lilong.129
0d416e74a1 change: use context to control ScreenRecord timeout or cancel 2025-03-06 22:16:49 +08:00
lilong.129
79e0323471 fix: screen record with scrcpy 2025-03-06 17:50:01 +08:00
lilong.129
cc81c00a82 feat: add adb screen record 2025-03-06 16:57:51 +08:00
lilong.129
b5fffdf548 move ghdc to pkg 2025-03-05 21:33:06 +08:00
lilong.129
e107389d6e refactor: move uixt pkg 2025-03-05 11:04:02 +08:00