- Fix JSON extraction logic by prioritizing brace counting method
- Add support for DOUBAO string array coordinate format
- Introduce IS_UI_TARS helper function for model type checking
- Add comprehensive tests for JSON parsing and coordinate handling
- Improve error handling with retry delays for LLM service failures
- Implement lazy loading for directory creation in config.go
- Add logFile parameter to InitLogger for better control
- Use dynamic directory existence check instead of flags
- Report command now uses console-only logging to prevent directory creation
- Support both JSON and colorized console output formats
- Maintain backward compatibility for all other commands
Changes:
- config.go: Convert directory paths to getter methods with lazy creation
- logger.go: Add logFile parameter and improve logging control
- cmd/root.go: Detect report command and disable file logging
- uixt/*: Update all references to use new getter methods
Fixes the issue where 'hrp report results/' would create unwanted timestamp directories
- Add RegisterTools method to ILLMService interface
- Create shared MCP to eino tool converter
- Auto-register built-in uixt tools in XTDriver initialization
- Refactor MCPHost to use shared converter
- Add comprehensive test coverage for tool conversion
This enables doubao-1.5-thinking-vision-pro model to access
MCP tools through function calling mechanism.
- Console logger respects user-specified log level
- File logger always uses DEBUG level to capture all logs
- Add custom leveledMultiWriter for different output levels
- Remove global log level setting for more granular control
- Add ModelName field to PlanningResult and SubActionResult
- Update HTML report with improved layout and model name display
- Fix elapsed time setting bug and enhance mobile responsiveness
- Move log message to same line as timestamp and level
- Reduce padding and font sizes for tighter spacing
- Optimize log data display with left border and indentation
- Add responsive design for mobile devices
- Achieve more compact display with fewer lines per log entry
- Fix JSON serialization to preserve Chinese characters instead of Unicode escaping
- Use SetEscapeHTML(false) in toJSON template function
- Apply safeHTML to prevent HTML entity encoding of Chinese text
- Now displays {"text":"连了又连"} instead of {"text":"连了又连"}
- Add complete HTML report generator with template-based rendering
- Implement log time filtering for step-specific logs
- Support responsive design and interactive UI features
- Consolidate duplicate report implementations
- Use defer for summary saving and HTML report generation to ensure they run regardless of exit path
- Remove unnecessary sync.Once for cleanup operations since defer guarantees single execution
- Simplify error handling logic by removing redundant runErr checks
- Improve interrupt handling with better logging messages
- Ensure graceful cleanup and data persistence even when interrupted
- Fix parameter mapping issue where AI model's 'content' parameter wasn't mapped to 'text' field
- Add mapParameterName function to handle parameter name mapping (content->text, key->keycode)
- Add comprehensive unit tests for convertProcessedArgs and mapParameterName functions
- Update existing test cases to match new parameter format (x,y for single coords, from_x,from_y,to_x,to_y for drag)
This resolves the issue where uixt__input action was not working due to parameter name mismatch.
- Support configuring multiple LLM services simultaneously
- Auto-derive model names from service types to simplify configuration
- Maintain backward compatibility with existing configurations
- Refactor configuration logic into dedicated env module
- Add comprehensive unit test coverage
- Update documentation with new configuration approach
- Add ResetHistory option to PlanningOptions and ActionOptions
- Implement task completion detection with isTaskFinished() method
- Add executeActions() method to separate action execution logic
- Modify ConversationHistory.Clear() to completely clear all messages including system message
- Refactor StartToGoal() to automatically reset history on first attempt
- Add WithResetHistory() option function for consistent API
- Consolidate test files into driver_ext_ai_test.go with comprehensive test coverage
- Remove all manual ReturnSchema() methods from tools
- Implement automatic schema generation using reflection
- Unify response format to flat structure with action/success/message fields
- Simplify tool implementation by removing MCPResponse embedding
- Update documentation to reflect new architecture
- Achieve ~70% code reduction while maintaining type safety
- Add ToolStartToGoal implementation with AI-driven goal automation
- Fix LLM service not initialized issue by applying global AI config to XTDriver creation
- Ensure XTDriver is created with proper AI services from the first initialization
- Add StartToGoal method to StepMobile for goal-oriented automation
- Register ToolStartToGoal in MCP server and add corresponding action type
- Add comprehensive test case for StartToGoal functionality
- Fix ReturnSchema consistency across AI tools (StartToGoal, AIAction, Finished)
- Extract AI service options in MCP argument processing
This resolves the root cause where XTDriver was created without AI services
in runStepMobileUI, ensuring only one XTDriver initialization with complete
AI service configuration.
- Remove redundant ActionSummary field from PlanningResult struct
- Update parsers to use unified Thought field instead of duplicate fields
- Modify chat interface to display Thought instead of ActionSummary
- Update planner logging to use thought instead of summary
- Adjust prompt templates to use thought field consistently
- Switch test LLM service from UI-TARS to DoubaoVL
- Add default parameter handling for sleep tool
- Add action mapping for UI-TARS parser to convert action names to option.ActionName
- Implement bounding box to center point coordinate conversion for better accuracy
- Update coordinate normalization to handle coordinates > 1000 properly
- Enhance test cases to verify coordinate scaling and center point conversion
- Improve action argument processing with proper coordinate transformation
- Add comprehensive test coverage for coordinate conversion edge cases
Key improvements:
- Bounding box [x1,y1,x2,y2] now converts to center point [cx,cy] for actions
- Coordinate scaling properly handles different screen resolutions
- Action names are mapped through doubao_1_5_ui_tars_action_mapping
- Enhanced error handling for invalid coordinate formats
- Move MobileAction struct from uixt package to uixt/option package
- Delete uixt/driver_action.go file as MobileAction is now in option package
- Update all import statements across the codebase to use option.MobileAction
- Update ActionTool interface to use option.MobileAction in ConvertActionToCallToolRequest method
- Maintain backward compatibility while improving package organization
- Clean up code structure by consolidating action-related types in option package
Files affected:
- server/uixt.go: Updated imports and type references
- step.go: Updated imports and ActionResult struct
- step_ui.go: Updated all MobileAction references to option.MobileAction
- uixt/mcp_server.go: Updated ActionTool interface and removed detailed comments
- uixt/mcp_server_test.go: Updated all test cases to use option.MobileAction
- uixt/mcp_tools_*.go: Updated ConvertActionToCallToolRequest method signatures
- uixt/option/action.go: Added MobileAction struct definition
- uixt/sdk.go: Updated ExecuteAction method signature
- Add identification for normal shutdown pipe errors in startStdioLog
- Optimize stdio log error handling logic to distinguish between normal shutdown and actual errors
- Add proper handling for SIGTERM (exit status 143) in isSignalError function
- Add debug logging for MCP config loading process
- Ensure clean shutdown without confusing error messages
- Add support for loading environment variables from ~/.hrp/.env
- Implement priority order: current working directory > ~/.hrp/.env > system environment variables
- Use godotenv.Overload() for both global and local .env files to ensure proper priority
- Maintain backward compatibility with existing functionality
- Add comprehensive error handling and logging
- Add detailed documentation for HttpRunner AI module
- Cover planning, assertion, computer vision, and session management
- Include architecture design, usage guide, and configuration
- Provide code examples and best practices
- Document all core components and interfaces
- Add detailed package-level documentation for mcp_server.go
- Create MCP_SERVER_DOCUMENTATION.md with complete implementation guide
- Create MCP_TOOLS_REFERENCE.md with quick reference for all tools
- Add extensive code comments for key structures and functions
- Document architecture, features, extension guide, and best practices
- Include usage examples and troubleshooting information
This provides complete documentation for developers to understand,
use, and extend the HttpRunner MCP server functionality.
- Replace all mapToStruct calls with parseActionOptions function
- Add parseActionOptions implementation for MCP request parameter parsing
- Remove undefined mapToStruct function that was causing compilation errors
- Standardize parameter names (fromX/fromY/toX/toY -> from_x/from_y/to_x/to_y)
- Add AntiRisk support for TapAbsXY and Drag tools
- Improve parameter validation for Drag tool
- Update corresponding test cases to match new parameter names
This fixes compilation errors and ensures all MCP tools work correctly.