Commit Graph

337 Commits

Author SHA1 Message Date
lilong.129
5f7698c6b4 fix: improve Chinese character display in HTML reports
- Fix JSON serialization to preserve Chinese characters instead of Unicode escaping
- Use SetEscapeHTML(false) in toJSON template function
- Apply safeHTML to prevent HTML entity encoding of Chinese text
- Now displays {"text":"连了又连"} instead of its Unicode-escaped form (\uXXXX sequences)
2025-06-08 09:29:41 +08:00
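A minimal sketch of the approach this commit describes, using only the Go standard library. The names `toJSON` and `safeHTML` come from the commit message, but their signatures and the template text here are assumptions, not the project's actual code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"html/template"
	"strings"
)

// toJSON is a hypothetical version of the template helper named in the commit.
// It encodes v as JSON without escaping HTML-sensitive characters.
func toJSON(v any) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false) // keep <, > and & literal instead of \u003c, \u003e, \u0026
	if err := enc.Encode(v); err != nil {
		return "", err
	}
	// Encode appends a trailing newline; trim it for inline use.
	return strings.TrimRight(buf.String(), "\n"), nil
}

// render pipes the JSON through safeHTML so html/template does not
// entity-encode the quotes in the output. The template text is illustrative.
func render(v any) (string, error) {
	funcs := template.FuncMap{
		"toJSON":   toJSON,
		"safeHTML": func(s string) template.HTML { return template.HTML(s) },
	}
	tmpl, err := template.New("report").Funcs(funcs).Parse(`<pre>{{ toJSON . | safeHTML }}</pre>`)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, v); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := render(map[string]string{"text": "连了又连"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Marking a string as `template.HTML` bypasses auto-escaping, so this is only appropriate when the JSON comes from trusted data.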
lilong.129
4053cc9985 feat: add comprehensive HTML report generation with log filtering
- Add complete HTML report generator with template-based rendering
- Implement log time filtering for step-specific logs
- Support responsive design and interactive UI features
- Consolidate duplicate report implementations
2025-06-08 09:23:14 +08:00
lilong.129
ec4f1eb68a refactor: unify action execution interface and merge AI action handling 2025-06-07 23:59:07 +08:00
lilong.129
fcf3009c67 fix: abnormal indent in summary.json 2025-06-07 20:45:35 +08:00
lilong.129
e75edf8400 feat: add log file output to results/taskID directory 2025-06-07 16:52:41 +08:00
lilong.129
604eed3340 refactor: optimize runner error handling and cleanup logic
- Use defer for summary saving and HTML report generation to ensure they run regardless of exit path
- Remove unnecessary sync.Once for cleanup operations since defer guarantees single execution
- Simplify error handling logic by removing redundant runErr checks
- Improve interrupt handling with better logging messages
- Ensure graceful cleanup and data persistence even when interrupted
2025-06-07 16:36:53 +08:00
lilong.129
460570f651 fix(uixt): fix uixt__input not working and add comprehensive unit tests
- Fix parameter mapping issue where AI model's 'content' parameter wasn't mapped to 'text' field
- Add mapParameterName function to handle parameter name mapping (content->text, key->keycode)
- Add comprehensive unit tests for convertProcessedArgs and mapParameterName functions
- Update existing test cases to match new parameter format (x,y for single coords, from_x,from_y,to_x,to_y for drag)

This resolves the issue where uixt__input action was not working due to parameter name mismatch.
2025-06-07 15:03:29 +08:00
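A minimal sketch of the parameter-name mapping this commit describes. The function name `mapParameterName` and the two mappings (content->text, key->keycode) are taken from the commit message; the signature and the `convertArgs` helper are assumptions for illustration.

```go
package main

import "fmt"

// mapParameterName translates a parameter name emitted by the AI model into
// the field name the action expects. Unknown names pass through unchanged.
func mapParameterName(name string) string {
	switch name {
	case "content":
		return "text"
	case "key":
		return "keycode"
	default:
		return name
	}
}

// convertArgs is a hypothetical helper that renames every key in args.
func convertArgs(args map[string]any) map[string]any {
	out := make(map[string]any, len(args))
	for k, v := range args {
		out[mapParameterName(k)] = v
	}
	return out
}

func main() {
	fmt.Println(convertArgs(map[string]any{"content": "hello"}))
}
```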
lilong.129
334c0dc141 fix: validators were not executed when a mobile step included validate 2025-06-06 22:18:43 +08:00
lilong.129
484eebdefd feat: implement multi-model service configuration support
- Support configuring multiple LLM services simultaneously
- Auto-derive model names from service types to simplify configuration
- Maintain backward compatibility with existing configurations
- Refactor configuration logic into dedicated env module
- Add comprehensive unit test coverage
- Update documentation with new configuration approach
2025-06-06 22:17:59 +08:00
lilong.129
b642ea004e feat: implement UI automation test history isolation
- Add ResetHistory option to PlanningOptions and ActionOptions
- Implement task completion detection with isTaskFinished() method
- Add executeActions() method to separate action execution logic
- Modify ConversationHistory.Clear() to completely clear all messages including system message
- Refactor StartToGoal() to automatically reset history on first attempt
- Add WithResetHistory() option function for consistent API
- Consolidate test files into driver_ext_ai_test.go with comprehensive test coverage
2025-06-06 15:29:42 +08:00
lilong.129
6e1bd5bbe2 feat: optimize MCP tools response format with automatic schema generation
- Remove all manual ReturnSchema() methods from tools
- Implement automatic schema generation using reflection
- Unify response format to flat structure with action/success/message fields
- Simplify tool implementation by removing MCPResponse embedding
- Update documentation to reflect new architecture
- Achieve ~70% code reduction while maintaining type safety
2025-06-05 23:17:06 +08:00
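Reflection-based schema generation of the kind this commit mentions could look roughly like the sketch below. The `tapResponse` struct and `schemaOf` function are hypothetical; only the flat action/success/message layout is taken from the commit message.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// tapResponse is a hypothetical flat tool response.
type tapResponse struct {
	Action  string  `json:"action"`
	Success bool    `json:"success"`
	Message string  `json:"message"`
	X       float64 `json:"x,omitempty"`
}

// schemaOf derives a field-name -> kind map from the struct's json tags,
// so tools do not need a hand-written schema method.
func schemaOf(v any) map[string]string {
	t := reflect.TypeOf(v)
	if t != nil && t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	if t == nil || t.Kind() != reflect.Struct {
		return nil
	}
	out := make(map[string]string, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name, _, _ := strings.Cut(f.Tag.Get("json"), ",") // drop options such as omitempty
		if name == "" || name == "-" {
			continue
		}
		out[name] = f.Type.Kind().String()
	}
	return out
}

func main() {
	fmt.Println(schemaOf(tapResponse{}))
}
```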
lilong.129
56831845ca change: fix logs 2025-06-05 20:26:18 +08:00
lilong.129
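5f400735fc fix: StartToGoal command could not be interrupted with CTRL+C
- Add a context.Context parameter to AI-related methods to support interruption
- Check for context cancellation inside the retry loop
- Create a cancellable context and listen for interrupt signals
- Update MCP tool calls to use the context-aware methods

Users can now interrupt long-running AI automation tasks with CTRL+C.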
2025-06-05 20:00:20 +08:00
lilong.129
d883aa6a21 change: rename VLM name 2025-06-05 18:09:25 +08:00
lilong.129
8cdc71d90b change: RoundToOneDecimal 2025-06-05 17:47:29 +08:00
lilong.129
c4e7ab00a7 feat: implement ToolStartToGoal and fix LLM service initialization
- Add ToolStartToGoal implementation with AI-driven goal automation
- Fix LLM service not initialized issue by applying global AI config to XTDriver creation
- Ensure XTDriver is created with proper AI services from the first initialization
- Add StartToGoal method to StepMobile for goal-oriented automation
- Register ToolStartToGoal in MCP server and add corresponding action type
- Add comprehensive test case for StartToGoal functionality
- Fix ReturnSchema consistency across AI tools (StartToGoal, AIAction, Finished)
- Extract AI service options in MCP argument processing

This resolves the root cause where XTDriver was created without AI services
in runStepMobileUI, ensuring only one XTDriver initialization with complete
AI service configuration.
2025-06-05 16:52:11 +08:00
lilong.129
0add3231ff refactor: merge ActionSummary and Thought fields to eliminate duplication
- Remove redundant ActionSummary field from PlanningResult struct
- Update parsers to use unified Thought field instead of duplicate fields
- Modify chat interface to display Thought instead of ActionSummary
- Update planner logging to use thought instead of summary
- Adjust prompt templates to use thought field consistently
- Switch test LLM service from UI-TARS to DoubaoVL
- Add default parameter handling for sleep tool
2025-06-05 14:19:09 +08:00
lilong.129
0864f74021 fix: update AI parser to use doubao-1.5-thinking-vision-pro configuration 2025-06-05 13:28:31 +08:00
lilong.129
c204542f1f feat: optimize UI-TARS parser with coordinate conversion and action mapping
- Add action mapping for UI-TARS parser to convert action names to option.ActionName
- Implement bounding box to center point coordinate conversion for better accuracy
- Update coordinate normalization to handle coordinates > 1000 properly
- Enhance test cases to verify coordinate scaling and center point conversion
- Improve action argument processing with proper coordinate transformation
- Add comprehensive test coverage for coordinate conversion edge cases

Key improvements:
- Bounding box [x1,y1,x2,y2] now converts to center point [cx,cy] for actions
- Coordinate scaling properly handles different screen resolutions
- Action names are mapped through doubao_1_5_ui_tars_action_mapping
- Enhanced error handling for invalid coordinate formats
2025-06-04 23:16:14 +08:00
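The bounding-box-to-center conversion listed under "Key improvements" amounts to the arithmetic below. The scaling step assumes model coordinates lie on a 0 to 1000 grid mapped linearly onto the screen; that assumption, and both function names, are illustrative and not confirmed by the commit.

```go
package main

import "fmt"

// bboxCenter converts a bounding box [x1, y1, x2, y2] to its center [cx, cy].
func bboxCenter(b [4]float64) (cx, cy float64) {
	return (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
}

// scaleToScreen maps a point from an assumed 0..1000 grid to screen pixels.
func scaleToScreen(x, y float64, width, height int) (float64, float64) {
	return x * float64(width) / 1000, y * float64(height) / 1000
}

func main() {
	cx, cy := bboxCenter([4]float64{100, 200, 300, 400})
	px, py := scaleToScreen(cx, cy, 1080, 1920)
	fmt.Println(cx, cy, px, py)
}
```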
lilong.129
1df529ecaa merge master 2025-06-03 18:20:55 +08:00
lilong.129
798e66abe6 Merge branch 'master' into mcp-plugin
Resolve merge conflicts and integrate latest changes from master branch.
2025-06-03 18:18:47 +08:00
lilong.129
bd8cb5abf4 refactor: move MobileAction to option package and update imports
- Move MobileAction struct from uixt package to uixt/option package
- Delete uixt/driver_action.go file as MobileAction is now in option package
- Update all import statements across the codebase to use option.MobileAction
- Update ActionTool interface to use option.MobileAction in ConvertActionToCallToolRequest method
- Maintain backward compatibility while improving package organization
- Clean up code structure by consolidating action-related types in option package

Files affected:
- server/uixt.go: Updated imports and type references
- step.go: Updated imports and ActionResult struct
- step_ui.go: Updated all MobileAction references to option.MobileAction
- uixt/mcp_server.go: Updated ActionTool interface and removed detailed comments
- uixt/mcp_server_test.go: Updated all test cases to use option.MobileAction
- uixt/mcp_tools_*.go: Updated ConvertActionToCallToolRequest method signatures
- uixt/option/action.go: Added MobileAction struct definition
- uixt/sdk.go: Updated ExecuteAction method signature
2025-06-03 18:15:28 +08:00
lilong.129
1cc4b1cf5b refactor: modularize MCP server tools by functionality
- Split large mcp_server.go into modular files by functionality
- Create dedicated files for each tool category:
  - mcp_tools_device.go: Device management tools
  - mcp_tools_touch.go: Touch operation tools  
  - mcp_tools_swipe.go: Swipe and drag operation tools
  - mcp_tools_input.go: Input and IME tools
  - mcp_tools_button.go: Button operation tools
  - mcp_tools_app.go: Application management tools
  - mcp_tools_screen.go: Screen operation tools
  - mcp_tools_utility.go: Utility tools (sleep, popups)
  - mcp_tools_web.go: Web operation tools
  - mcp_tools_ai.go: AI-driven operation tools
- Update mcp_server.md documentation to reflect modular architecture
- Maintain pure ActionTool architecture with complete tool decoupling
- Improve code organization and maintainability
2025-06-03 15:45:42 +08:00
lilong.129
37028c4263 feat(mcphost): optimize shutdown logging to avoid false error messages
- Add identification for normal shutdown pipe errors in startStdioLog
- Optimize stdio log error handling logic to distinguish between normal shutdown and actual errors
- Add proper handling for SIGTERM (exit status 143) in isSignalError function
- Add debug logging for MCP config loading process
- Ensure clean shutdown without confusing error messages
2025-06-03 13:21:38 +08:00
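Exit status 143 follows the common shell convention of 128 plus the signal number, and SIGTERM is signal 15. A sketch of that check, with `signalFromExitCode` as a hypothetical helper rather than the project's `isSignalError`:

```go
package main

import "fmt"

// signalFromExitCode decodes the shell convention exit = 128 + signal number.
// It reports ok=false for ordinary exit codes.
func signalFromExitCode(code int) (sig int, ok bool) {
	if code > 128 && code <= 128+64 {
		return code - 128, true
	}
	return 0, false
}

func main() {
	sig, ok := signalFromExitCode(143)
	fmt.Println(sig, ok) // 15 true: terminated by SIGTERM
}
```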
lilong.129
184081592c feat: add global .env file support from ~/.hrp/.env
- Add support for loading environment variables from ~/.hrp/.env
- Implement priority order: current working directory > ~/.hrp/.env > system environment variables
- Use godotenv.Overload() for both global and local .env files to ensure proper priority
- Maintain backward compatibility with existing functionality
- Add comprehensive error handling and logging
2025-06-02 11:52:36 +08:00
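The priority order above can be modelled as layered maps where later layers override earlier ones, which is the effect of loading the global file first and the local file second with an overriding loader. This sketch deliberately avoids the godotenv dependency; `mergeEnv` and the variable names are hypothetical.

```go
package main

import "fmt"

// mergeEnv applies layers in order; a key in a later layer replaces
// the same key from an earlier layer.
func mergeEnv(layers ...map[string]string) map[string]string {
	out := map[string]string{}
	for _, layer := range layers {
		for k, v := range layer {
			out[k] = v
		}
	}
	return out
}

func main() {
	system := map[string]string{"MODEL": "from-system", "REGION": "from-system"}
	global := map[string]string{"MODEL": "from-global"} // stands in for ~/.hrp/.env
	local := map[string]string{"MODEL": "from-local"}   // stands in for ./.env

	// Lowest priority first: system < global < local.
	env := mergeEnv(system, global, local)
	fmt.Println(env["MODEL"], env["REGION"])
}
```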
lilong.129
9089bd9324 feat: refactor MCP tool export logic and improve the return value type system 2025-05-31 00:28:24 +08:00
lilong.129
2a392f204b docs: add comprehensive AI module documentation
- Add detailed documentation for HttpRunner AI module
- Cover planning, assertion, computer vision, and session management
- Include architecture design, usage guide, and configuration
- Provide code examples and best practices
- Document all core components and interfaces
2025-05-30 00:45:44 +08:00
lilong.129
f702a3cc78 docs: add comprehensive documentation for MCP server
- Add detailed package-level documentation for mcp_server.go
- Create MCP_SERVER_DOCUMENTATION.md with complete implementation guide
- Create MCP_TOOLS_REFERENCE.md with quick reference for all tools
- Add extensive code comments for key structures and functions
- Document architecture, features, extension guide, and best practices
- Include usage examples and troubleshooting information

This provides complete documentation for developers to understand,
use, and extend the HttpRunner MCP server functionality.
2025-05-30 00:37:26 +08:00
lilong.129
4e77ec4002 fix: replace undefined mapToStruct with parseActionOptions in MCP server
- Replace all mapToStruct calls with parseActionOptions function
- Add parseActionOptions implementation for MCP request parameter parsing
- Remove undefined mapToStruct function that was causing compilation errors
- Standardize parameter names (fromX/fromY/toX/toY -> from_x/from_y/to_x/to_y)
- Add AntiRisk support for TapAbsXY and Drag tools
- Improve parameter validation for Drag tool
- Update corresponding test cases to match new parameter names

This fixes compilation errors and ensures all MCP tools work correctly.
2025-05-29 20:37:14 +08:00
lilong.129
dc20eaa816 fix: resolve global AntiRisk configuration not taking effect 2025-05-29 19:22:23 +08:00
lilong.129
d3011d467e feat: enhance signal handling and graceful shutdown for MCP integration 2025-05-29 00:59:17 +08:00
lilong.129
c5fb391ef5 feat: add global AntiRisk configuration support
- Add AntiRisk field to TConfig struct for global anti-risk switch
- Add SetAntiRisk method to configure global anti-risk setting
- Implement automatic AntiRisk application in mobile UI steps
- Global AntiRisk setting applies to all actions unless explicitly disabled
- Maintains backward compatibility with existing action-level AntiRisk settings
2025-05-29 00:11:34 +08:00
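"Applies to all actions unless explicitly disabled" implies a three-state setting per action: unset, enabled, or disabled. One way to sketch that is a pointer field, shown below. The `actionOptions` type and `effectiveAntiRisk` function are assumptions; the commit does not show the real field types.

```go
package main

import "fmt"

// actionOptions is a hypothetical per-action option set.
// A nil AntiRisk means "not specified, fall back to the global setting".
type actionOptions struct {
	AntiRisk *bool
}

// effectiveAntiRisk resolves the setting for one action.
func effectiveAntiRisk(global bool, opts actionOptions) bool {
	if opts.AntiRisk != nil {
		return *opts.AntiRisk // an explicit action-level value wins
	}
	return global
}

func main() {
	off := false
	fmt.Println(effectiveAntiRisk(true, actionOptions{}))               // true
	fmt.Println(effectiveAntiRisk(true, actionOptions{AntiRisk: &off})) // false
}
```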
lilong.129
08a8b06578 feat: add MCP config support to hrp run command with priority handling 2025-05-28 23:11:52 +08:00
lilong.129
4ea08b0198 feat: integrate MCP tools with UI actions and improve environment variable inheritance 2025-05-28 22:58:59 +08:00
lilong.129
2fe5b14d63 refactor: integrate and optimize MCP tool calling methods 2025-05-27 21:39:17 +08:00
lilong.129
229fd4678c fix: use localhost instead of 127.0.0.1 2025-05-27 20:16:10 +08:00
张开元
a3328278df fix 2025-05-27 20:13:14 +08:00
张开元
272c2ed1eb fix 2025-05-27 20:11:48 +08:00
lilong.129
866cc0e4d2 feat: implement MCP hooks integration with anti_risk option 2025-05-27 19:46:08 +08:00
lilong.129
f4cc74b3ca docs: update dev instruct 2025-05-27 15:34:41 +08:00
lilong.129
6c60383f70 docs: add architecture 2025-05-27 15:28:03 +08:00
lilong.129
3936c0f487 change: remove unused code 2025-05-27 13:42:51 +08:00
lilong.129
404865ba6b refactor: complete ActionOptions unification and pointer type optimization 2025-05-27 13:34:12 +08:00
lilong.129
7fb966b7ba refactor: improve ActionMethod type safety and eliminate type conversions 2025-05-27 11:49:30 +08:00
lilong.129
466fe39cb9 docs: add comprehensive migration summary for ActionOptions and Request integration
- Document the complete integration process of ActionOptions and Request structures
- Include detailed statistics: 40 tools migrated with 100% test pass rate
- Provide technical implementation details and usage examples
- Record backward compatibility guarantees and migration helpers
- Summarize code quality improvements and performance optimizations
- Outline future development plans and goals

This documentation serves as a complete record of the unification initiative
and provides guidance for future development and maintenance.
2025-05-26 23:13:19 +08:00
lilong.129
a47d65bc4e feat: migrate all remaining MCP tools to use UnifiedActionRequest
- Migrated 39 remaining MCP tools from individual Request structures to UnifiedActionRequest
- All tools now use unifiedReq.GetMCPOptions(ACTION_*) instead of option.NewMCPOptions(Request{})
- Completed the unification of parameter definitions across all 40 MCP tools
- Eliminated duplication between ActionOptions and Request structures
- All tests pass, confirming successful migration

Tools migrated:
- Basic operations: TapAbsXY, TapByOCR, TapByCV, DoubleTapXY
- Device management: ListPackages, ScreenShot, GetScreenSize, PressButton
- App management: LaunchApp, TerminateApp, AppInstall, AppUninstall, AppClear
- Swipe operations: SwipeDirection, SwipeCoordinate, SwipeToTapApp, SwipeToTapText, SwipeToTapTexts
- Input/Navigation: Input, Home, Back, Drag
- Web operations: WebLoginNoneUI, SecondaryClick, HoverBySelector, TapBySelector, SecondaryClickBySelector, WebCloseTab
- System utilities: SetIme, GetSource, ClosePopups
- Sleep operations: SleepMS, SleepRandom
- AI/Task management: AIAction, Finished

This completes the ActionOptions and Request structures integration initiative.
2025-05-26 23:10:58 +08:00
lilong.129
8181e4244a refactor ToolSwipe to delegate to existing tools instead of duplicating logic
- Modified ToolSwipe.ConvertActionToCallToolRequest to delegate to ToolSwipeDirection and ToolSwipeCoordinate
- Removed duplicate parameter handling logic in favor of reusing existing implementations
- Fixed linter error by removing unused variable
- Maintained backward compatibility while reducing code duplication
- All tests pass, confirming the refactoring is successful
2025-05-26 22:42:50 +08:00
lilong.129
6ae4c300c1 add generic swipe tool with auto-detection of direction vs coordinate params
- Added ACTION_Swipe to option/action.go for generic swipe functionality
- Implemented ToolSwipe in mcp_server.go that automatically detects parameter type:
  - String params (up/down/left/right) use direction-based swipe logic
  - Array params [fromX, fromY, toX, toY] use coordinate-based swipe logic
- Added comprehensive test coverage for ToolSwipe in mcp_server_test.go
- Updated tool registration to include the new generic swipe tool
- All tests pass, confirming backward compatibility with existing tools
2025-05-26 22:39:23 +08:00
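The auto-detection described in this commit maps naturally onto a Go type switch. `classifySwipe` is a hypothetical helper; the real ToolSwipe dispatches to the direction and coordinate tools, and its parameter types may differ.

```go
package main

import "fmt"

// classifySwipe decides which swipe logic applies based on the parameter type:
// a direction string, or a [fromX, fromY, toX, toY] coordinate slice.
func classifySwipe(params any) (string, error) {
	switch p := params.(type) {
	case string:
		switch p {
		case "up", "down", "left", "right":
			return "direction", nil
		}
		return "", fmt.Errorf("unknown swipe direction %q", p)
	case []float64:
		if len(p) != 4 {
			return "", fmt.Errorf("expected 4 coordinates, got %d", len(p))
		}
		return "coordinate", nil
	default:
		return "", fmt.Errorf("unsupported swipe params of type %T", params)
	}
}

func main() {
	fmt.Println(classifySwipe("up"))
	fmt.Println(classifySwipe([]float64{0.5, 0.8, 0.5, 0.2}))
}
```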
lilong.129
8895e9e970 merge mcp_tools_test.go into mcp_server_test.go
- Merged all individual MCP tool test functions from mcp_tools_test.go into mcp_server_test.go
- Added require import for additional test assertions
- Removed duplicate TestMCPServer4XTDriver function
- Deleted the original mcp_tools_test.go file
- All 39 MCP tools now have comprehensive unit tests in a single file
- Tests cover tool name, description, options, and request conversion functionality
2025-05-26 22:32:26 +08:00
lilong.129
77f5683f9a fix: remove unnecessary IgnoreNotFoundError and MaxRetryTimes from coordinate-based tap tools
- Removed IgnoreNotFoundError and MaxRetryTimes parameters from TapRequest, TapAbsXYRequest, and DoubleTapXYRequest structures
- Updated corresponding tool implementations to remove references to these non-existent fields
- These parameters are not applicable to coordinate-based operations as they don't involve element searching
- Only OCR/CV-based operations need these error handling parameters

This ensures that only relevant tools have the ignore_NotFoundError functionality,
making the API more consistent and avoiding confusion.
2025-05-26 22:10:08 +08:00