Commit Graph

4631 Commits

Author SHA1 Message Date
lilong.129
88ae8faee1 feat: enhance VLM response parsing and DOUBAO model support
- Fix JSON extraction logic by prioritizing brace counting method
- Add support for DOUBAO string array coordinate format
- Introduce IS_UI_TARS helper function for model type checking
- Add comprehensive tests for JSON parsing and coordinate handling
- Improve error handling with retry delays for LLM service failures
2025-06-10 15:56:13 +08:00
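The "brace counting" extraction named in commit 88ae8faee1 can be sketched roughly as follows. This is an illustration only, not the repository's code: the function name `extractJSONByBraces` and its behaviour (return the first balanced, valid `{...}` object, ignoring braces inside string literals) are assumptions.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// extractJSONByBraces is a hypothetical helper: it returns the first balanced
// {...} span in content that is also valid JSON. Braces inside JSON string
// literals are skipped so they do not affect the depth counter.
func extractJSONByBraces(content string) (string, error) {
	start, depth := -1, 0
	inString, escaped := false, false
	for i := 0; i < len(content); i++ {
		c := content[i]
		if start < 0 {
			// Still searching for the opening brace of a candidate object.
			if c == '{' {
				start, depth = i, 1
			}
			continue
		}
		if inString {
			switch {
			case escaped:
				escaped = false
			case c == '\\':
				escaped = true
			case c == '"':
				inString = false
			}
			continue
		}
		switch c {
		case '"':
			inString = true
		case '{':
			depth++
		case '}':
			depth--
			if depth == 0 {
				candidate := content[start : i+1]
				if json.Valid([]byte(candidate)) {
					return candidate, nil
				}
				start = -1 // balanced but not valid JSON; keep scanning
			}
		}
	}
	return "", errors.New("no balanced JSON object found")
}

func main() {
	reply := "Plan: {\"action\": \"click\", \"args\": {\"x\": 1}} done"
	obj, err := extractJSONByBraces(reply)
	fmt.Println(obj, err)
}
```

Counting braces rather than matching with a regular expression keeps nested objects intact, which is presumably why the commit prioritizes it.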
lilong.129
4959c2e47e feat: extractJSONFromContent 2025-06-10 14:08:44 +08:00
lilong.129
7dc0f869be fix: extracts JSON content from various formats in the response 2025-06-10 14:02:41 +08:00
lilong.129
90401eeb78 change: remove unnecessary logs 2025-06-10 13:19:36 +08:00
lilong.129
f5f6d177ab fix: optimize report command to avoid creating timestamp directories
- Implement lazy loading for directory creation in config.go
- Add logFile parameter to InitLogger for better control
- Use dynamic directory existence check instead of flags
- Report command now uses console-only logging to prevent directory creation
- Support both JSON and colorized console output formats
- Maintain backward compatibility for all other commands

Changes:
- config.go: Convert directory paths to getter methods with lazy creation
- logger.go: Add logFile parameter and improve logging control
- cmd/root.go: Detect report command and disable file logging
- uixt/*: Update all references to use new getter methods

Fixes the issue where 'hrp report results/' would create unwanted timestamp directories
2025-06-10 12:06:08 +08:00
lilong.129
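6588d95154 fix: garbled Chinese characters in summary.json
- Improve file writing in the Dump2JSON function to ensure UTF-8 encoding is handled correctly
- Add a file sync operation to prevent incomplete data
- Add UTF-8 encoding tests to verify the fix
- Apply the same file-writing improvements to HTML report generation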
2025-06-10 11:03:10 +08:00
lilong.129
12cebef3b9 change: set llm timeout to 120s 2025-06-09 22:42:19 +08:00
lilong.129
39acadb0a7 feat: add MCP tools registration to LLM service
- Add RegisterTools method to ILLMService interface
- Create shared MCP to eino tool converter
- Auto-register built-in uixt tools in XTDriver initialization
- Refactor MCPHost to use shared converter
- Add comprehensive test coverage for tool conversion

This enables doubao-1.5-thinking-vision-pro model to access
MCP tools through function calling mechanism.
2025-06-09 22:19:43 +08:00
lilong.129
dd52faef57 refactor: move Call function 2025-06-09 20:52:32 +08:00
lilong.129
f1544d4a5c feat: implement separate log levels for console and file output
- Console logger respects user-specified log level
- File logger always uses DEBUG level to capture all logs
- Add custom leveledMultiWriter for different output levels
- Remove global log level setting for more granular control
2025-06-09 19:16:39 +08:00
lilong.129
533c1f4bff feat: add mcp tool ToolScreenRecord 2025-06-09 17:18:26 +08:00
lilong.129
96da4515a1 feat: optimize test report UI and add LLM usage tracking 2025-06-09 17:04:55 +08:00
lilong.129
e85802cdda feat: add download for summary.json and hrp.log in report.html 2025-06-09 00:29:27 +08:00
lilong.129
a91a10ac13 docs: update cmd docs 2025-06-09 00:06:23 +08:00
lilong.129
cf360c8c46 feat: compress image data for html report 2025-06-08 23:48:23 +08:00
lilong.129
14cef72f5a feat: add model name display in AI actions and optimize HTML report
- Add ModelName field to PlanningResult and SubActionResult
- Update HTML report with improved layout and model name display
- Fix elapsed time setting bug and enhance mobile responsiveness
2025-06-08 22:08:51 +08:00
lilong.129
660e8ca124 feat: add mcp tool ToolGetForegroundApp 2025-06-08 19:25:09 +08:00
lilong.129
b9de3cf7a3 refactor: simplify AI action execution and improve sub-action handling 2025-06-08 19:16:37 +08:00
lilong.129
bdf64a08aa feat: enhance HTML report with statistics and collapsible log fields 2025-06-08 10:05:30 +08:00
lilong.129
f2607f7664 style: optimize log display for more compact layout
- Move log message to same line as timestamp and level
- Reduce padding and font sizes for tighter spacing
- Optimize log data display with left border and indentation
- Add responsive design for mobile devices
- Achieve more compact display with fewer lines per log entry
2025-06-08 09:34:21 +08:00
lilong.129
5f7698c6b4 fix: improve Chinese character display in HTML reports
- Fix JSON serialization to preserve Chinese characters instead of Unicode escaping
- Use SetEscapeHTML(false) in toJSON template function
- Apply safeHTML to prevent HTML entity encoding of Chinese text
- Now displays {"text":"连了又连"} instead of its escaped form
2025-06-08 09:29:41 +08:00
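Commit 5f7698c6b4 mentions `SetEscapeHTML(false)`. A minimal sketch of a JSON helper built on it is shown below; the helper name `toJSON` mirrors the template function named in the commit, but its body here is an assumption. Note that Go's `encoding/json` emits non-ASCII text such as Chinese as UTF-8 by default; `SetEscapeHTML(false)` only stops `<`, `>`, and `&` from being written as `\u003c`, `\u003e`, and `\u0026`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// toJSON is a hypothetical helper that marshals v without HTML escaping.
func toJSON(v any) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false) // keep <, >, & as literal characters
	if err := enc.Encode(v); err != nil {
		return "", err
	}
	// Encode appends a trailing newline; trim it for inline use.
	return strings.TrimRight(buf.String(), "\n"), nil
}

func main() {
	out, err := toJSON(map[string]string{"text": "连了又连 <b>"})
	fmt.Println(out, err)
}
```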
lilong.129
4053cc9985 feat: add comprehensive HTML report generation with log filtering
- Add complete HTML report generator with template-based rendering
- Implement log time filtering for step-specific logs
- Support responsive design and interactive UI features
- Consolidate duplicate report implementations
2025-06-08 09:23:14 +08:00
lilong.129
ec4f1eb68a refactor: unify action execution interface and merge AI action handling 2025-06-07 23:59:07 +08:00
lilong.129
fcf3009c67 fix: abnormal indent in summary.json 2025-06-07 20:45:35 +08:00
lilong.129
e75edf8400 feat: add log file output to results/taskID directory 2025-06-07 16:52:41 +08:00
lilong.129
604eed3340 refactor: optimize runner error handling and cleanup logic
- Use defer for summary saving and HTML report generation to ensure they run regardless of exit path
- Remove unnecessary sync.Once for cleanup operations since defer guarantees single execution
- Simplify error handling logic by removing redundant runErr checks
- Improve interrupt handling with better logging messages
- Ensure graceful cleanup and data persistence even when interrupted
2025-06-07 16:36:53 +08:00
lilong.129
460570f651 fix(uixt): fix uixt__input not working and add comprehensive unit tests
- Fix parameter mapping issue where AI model's 'content' parameter wasn't mapped to 'text' field
- Add mapParameterName function to handle parameter name mapping (content->text, key->keycode)
- Add comprehensive unit tests for convertProcessedArgs and mapParameterName functions
- Update existing test cases to match new parameter format (x,y for single coords, from_x,from_y,to_x,to_y for drag)

This resolves the issue where uixt__input action was not working due to parameter name mismatch.
2025-06-07 15:03:29 +08:00
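The parameter mapping from commit 460570f651 (content->text, key->keycode) might look like the sketch below. Only the function name `mapParameterName` and the two mappings come from the commit message; the alias table, the `mapArguments` helper, and the signatures are assumptions.

```go
package main

import "fmt"

// parameterAliases lists the two renames named in the commit message.
var parameterAliases = map[string]string{
	"content": "text",
	"key":     "keycode",
}

// mapParameterName returns the field name for a model-supplied parameter,
// leaving unknown names unchanged.
func mapParameterName(name string) string {
	if mapped, ok := parameterAliases[name]; ok {
		return mapped
	}
	return name
}

// mapArguments is a hypothetical helper that applies the mapping to a whole
// argument map without modifying the input.
func mapArguments(args map[string]any) map[string]any {
	out := make(map[string]any, len(args))
	for k, v := range args {
		out[mapParameterName(k)] = v
	}
	return out
}

func main() {
	fmt.Println(mapArguments(map[string]any{"content": "hello"}))
}
```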
lilong.129
334c0dc141 fix: validators not executing when a mobile step includes validate 2025-06-06 22:18:43 +08:00
lilong.129
484eebdefd feat: implement multi-model service configuration support
- Support configuring multiple LLM services simultaneously
- Auto-derive model names from service types to simplify configuration
- Maintain backward compatibility with existing configurations
- Refactor configuration logic into dedicated env module
- Add comprehensive unit test coverage
- Update documentation with new configuration approach
2025-06-06 22:17:59 +08:00
lilong.129
b642ea004e feat: implement UI automation test history isolation
- Add ResetHistory option to PlanningOptions and ActionOptions
- Implement task completion detection with isTaskFinished() method
- Add executeActions() method to separate action execution logic
- Modify ConversationHistory.Clear() to completely clear all messages including system message
- Refactor StartToGoal() to automatically reset history on first attempt
- Add WithResetHistory() option function for consistent API
- Consolidate test files into driver_ext_ai_test.go with comprehensive test coverage
2025-06-06 15:29:42 +08:00
lilong.129
6e1bd5bbe2 feat: optimize MCP tools response format with automatic schema generation
- Remove all manual ReturnSchema() methods from tools
- Implement automatic schema generation using reflection
- Unify response format to flat structure with action/success/message fields
- Simplify tool implementation by removing MCPResponse embedding
- Update documentation to reflect new architecture
- Achieve ~70% code reduction while maintaining type safety
2025-06-05 23:17:06 +08:00
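Reflection-based schema generation, as described in commit 6e1bd5bbe2, can be illustrated with a small sketch. The `tapResponse` struct, the `generateSchema` function, and the output shape (field name to Go kind) are assumptions; only the flat action/success/message fields come from the commit message.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// tapResponse is a hypothetical flat tool response.
type tapResponse struct {
	Action  string  `json:"action"`
	Success bool    `json:"success"`
	Message string  `json:"message"`
	X       float64 `json:"x"`
	Y       float64 `json:"y"`
}

// generateSchema derives a field-name -> kind map from a struct's json tags,
// so tools need no hand-written schema method.
func generateSchema(v any) map[string]string {
	schema := map[string]string{}
	t := reflect.TypeOf(v)
	for t != nil && t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	if t == nil || t.Kind() != reflect.Struct {
		return schema
	}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue
		}
		name := strings.Split(f.Tag.Get("json"), ",")[0]
		if name == "-" {
			continue // field is excluded from JSON output
		}
		if name == "" {
			name = f.Name
		}
		schema[name] = f.Type.Kind().String()
	}
	return schema
}

func main() {
	fmt.Println(generateSchema(tapResponse{}))
}
```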
lilong.129
56831845ca change: fix logs 2025-06-05 20:26:18 +08:00
lilong.129
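5f400735fc fix: StartToGoal command could not be interrupted with CTRL+C
- Add a context.Context parameter to AI-related methods to support interruption
- Add context cancellation checks in the retry loop
- Create a cancellable context and listen for interrupt signals
- Update MCP tool calls to use context-aware methods

Users can now interrupt long-running AI automation tasks with CTRL+C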
2025-06-05 20:00:20 +08:00
lilong.129
d883aa6a21 change: rename VLM name 2025-06-05 18:09:25 +08:00
lilong.129
8cdc71d90b change: RoundToOneDecimal 2025-06-05 17:47:29 +08:00
lilong.129
c4e7ab00a7 feat: implement ToolStartToGoal and fix LLM service initialization
- Add ToolStartToGoal implementation with AI-driven goal automation
- Fix LLM service not initialized issue by applying global AI config to XTDriver creation
- Ensure XTDriver is created with proper AI services from the first initialization
- Add StartToGoal method to StepMobile for goal-oriented automation
- Register ToolStartToGoal in MCP server and add corresponding action type
- Add comprehensive test case for StartToGoal functionality
- Fix ReturnSchema consistency across AI tools (StartToGoal, AIAction, Finished)
- Extract AI service options in MCP argument processing

This resolves the root cause where XTDriver was created without AI services
in runStepMobileUI, ensuring only one XTDriver initialization with complete
AI service configuration.
2025-06-05 16:52:11 +08:00
lilong.129
0add3231ff refactor: merge ActionSummary and Thought fields to eliminate duplication
- Remove redundant ActionSummary field from PlanningResult struct
- Update parsers to use unified Thought field instead of duplicate fields
- Modify chat interface to display Thought instead of ActionSummary
- Update planner logging to use thought instead of summary
- Adjust prompt templates to use thought field consistently
- Switch test LLM service from UI-TARS to DoubaoVL
- Add default parameter handling for sleep tool
2025-06-05 14:19:09 +08:00
lilong.129
0864f74021 fix: update AI parser to use doubao-1.5-thinking-vision-pro configuration 2025-06-05 13:28:31 +08:00
lilong.129
c204542f1f feat: optimize UI-TARS parser with coordinate conversion and action mapping
- Add action mapping for UI-TARS parser to convert action names to option.ActionName
- Implement bounding box to center point coordinate conversion for better accuracy
- Update coordinate normalization to handle coordinates > 1000 properly
- Enhance test cases to verify coordinate scaling and center point conversion
- Improve action argument processing with proper coordinate transformation
- Add comprehensive test coverage for coordinate conversion edge cases

Key improvements:
- Bounding box [x1,y1,x2,y2] now converts to center point [cx,cy] for actions
- Coordinate scaling properly handles different screen resolutions
- Action names are mapped through doubao_1_5_ui_tars_action_mapping
- Enhanced error handling for invalid coordinate formats
2025-06-04 23:16:14 +08:00
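The bounding-box to center-point conversion from commit c204542f1f is simple arithmetic, sketched below. The function names are hypothetical, and the assumption that model coordinates lie on a 0..1000 relative grid is inferred from the commit's mention of 1000; the commit's special handling of coordinates above 1000 is not reproduced here.

```go
package main

import "fmt"

// relativeGrid is the assumed size of the model's normalized coordinate space.
const relativeGrid = 1000.0

// bboxCenter converts a bounding box [x1, y1, x2, y2] to its center [cx, cy].
func bboxCenter(box [4]float64) (cx, cy float64) {
	return (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
}

// toScreen scales a relative coordinate to pixels for a screen dimension.
func toScreen(rel, screenSize float64) float64 {
	return rel / relativeGrid * screenSize
}

func main() {
	cx, cy := bboxCenter([4]float64{100, 200, 300, 400})
	// Map the relative center onto a 1080x1920 screen.
	fmt.Println(toScreen(cx, 1080), toScreen(cy, 1920))
}
```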
lilong.129
1df529ecaa merge master 2025-06-03 18:20:55 +08:00
lilong.129
798e66abe6 Merge branch 'master' into mcp-plugin
Resolve merge conflicts and integrate latest changes from master branch.
2025-06-03 18:18:47 +08:00
lilong.129
bd8cb5abf4 refactor: move MobileAction to option package and update imports
- Move MobileAction struct from uixt package to uixt/option package
- Delete uixt/driver_action.go file as MobileAction is now in option package
- Update all import statements across the codebase to use option.MobileAction
- Update ActionTool interface to use option.MobileAction in ConvertActionToCallToolRequest method
- Maintain backward compatibility while improving package organization
- Clean up code structure by consolidating action-related types in option package

Files affected:
- server/uixt.go: Updated imports and type references
- step.go: Updated imports and ActionResult struct
- step_ui.go: Updated all MobileAction references to option.MobileAction
- uixt/mcp_server.go: Updated ActionTool interface and removed detailed comments
- uixt/mcp_server_test.go: Updated all test cases to use option.MobileAction
- uixt/mcp_tools_*.go: Updated ConvertActionToCallToolRequest method signatures
- uixt/option/action.go: Added MobileAction struct definition
- uixt/sdk.go: Updated ExecuteAction method signature
2025-06-03 18:15:28 +08:00
lilong.129
1cc4b1cf5b refactor: modularize MCP server tools by functionality
- Split large mcp_server.go into modular files by functionality
- Create dedicated files for each tool category:
  - mcp_tools_device.go: Device management tools
  - mcp_tools_touch.go: Touch operation tools  
  - mcp_tools_swipe.go: Swipe and drag operation tools
  - mcp_tools_input.go: Input and IME tools
  - mcp_tools_button.go: Button operation tools
  - mcp_tools_app.go: Application management tools
  - mcp_tools_screen.go: Screen operation tools
  - mcp_tools_utility.go: Utility tools (sleep, popups)
  - mcp_tools_web.go: Web operation tools
  - mcp_tools_ai.go: AI-driven operation tools
- Update mcp_server.md documentation to reflect modular architecture
- Maintain pure ActionTool architecture with complete tool decoupling
- Improve code organization and maintainability
2025-06-03 15:45:42 +08:00
lilong.129
37028c4263 feat(mcphost): optimize shutdown logging to avoid false error messages
- Add identification for normal shutdown pipe errors in startStdioLog
- Optimize stdio log error handling logic to distinguish between normal shutdown and actual errors
- Add proper handling for SIGTERM (exit status 143) in isSignalError function
- Add debug logging for MCP config loading process
- Ensure clean shutdown without confusing error messages
2025-06-03 13:21:38 +08:00
lilong.129
184081592c feat: add global .env file support from ~/.hrp/.env
- Add support for loading environment variables from ~/.hrp/.env
- Implement priority order: current working directory > ~/.hrp/.env > system environment variables
- Use godotenv.Overload() for both global and local .env files to ensure proper priority
- Maintain backward compatibility with existing functionality
- Add comprehensive error handling and logging
2025-06-02 11:52:36 +08:00
lilong.129
9089bd9324 feat: refactor MCP tool export logic and improve the return value type system 2025-05-31 00:28:24 +08:00
lilong.129
2a392f204b docs: add comprehensive AI module documentation
- Add detailed documentation for HttpRunner AI module
- Cover planning, assertion, computer vision, and session management
- Include architecture design, usage guide, and configuration
- Provide code examples and best practices
- Document all core components and interfaces
2025-05-30 00:45:44 +08:00
lilong.129
f702a3cc78 docs: add comprehensive documentation for MCP server
- Add detailed package-level documentation for mcp_server.go
- Create MCP_SERVER_DOCUMENTATION.md with complete implementation guide
- Create MCP_TOOLS_REFERENCE.md with quick reference for all tools
- Add extensive code comments for key structures and functions
- Document architecture, features, extension guide, and best practices
- Include usage examples and troubleshooting information

This provides complete documentation for developers to understand,
use, and extend the HttpRunner MCP server functionality.
2025-05-30 00:37:26 +08:00
lilong.129
4e77ec4002 fix: replace undefined mapToStruct with parseActionOptions in MCP server
- Replace all mapToStruct calls with parseActionOptions function
- Add parseActionOptions implementation for MCP request parameter parsing
- Remove undefined mapToStruct function that was causing compilation errors
- Standardize parameter names (fromX/fromY/toX/toY -> from_x/from_y/to_x/to_y)
- Add AntiRisk support for TapAbsXY and Drag tools
- Improve parameter validation for Drag tool
- Update corresponding test cases to match new parameter names

This fixes compilation errors and ensures all MCP tools work correctly.
2025-05-29 20:37:14 +08:00
lilong.129
dc20eaa816 fix: resolve global AntiRisk configuration not taking effect 2025-05-29 19:22:23 +08:00