feat: optimize UI-TARS parser with coordinate conversion and action mapping

- Add action mapping for UI-TARS parser to convert action names to option.ActionName
- Implement bounding box to center point coordinate conversion for better accuracy
- Update coordinate normalization to handle coordinates > 1000 properly
- Enhance test cases to verify coordinate scaling and center point conversion
- Improve action argument processing with proper coordinate transformation
- Add comprehensive test coverage for coordinate conversion edge cases
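
For illustration, the action mapping described above could look roughly like the following. This is a minimal sketch, not the code from this commit: `ActionName` stands in for `option.ActionName`, and the constants, the `actionMapping` table, its entries, and `mapAction` are all hypothetical names chosen for the example.

```go
package main

import "fmt"

// ActionName is a stand-in for option.ActionName (assumption for this sketch).
type ActionName string

// Hypothetical action constants; the real project may use different values.
const (
	ActionTap   ActionName = "tap"
	ActionDrag  ActionName = "drag"
	ActionInput ActionName = "input"
)

// actionMapping is a hypothetical table from model-emitted action names
// to internal action names. The entries are illustrative only.
var actionMapping = map[string]ActionName{
	"click": ActionTap,
	"drag":  ActionDrag,
	"type":  ActionInput,
}

// mapAction looks up a model action name and reports an error
// when the name has no entry in the table.
func mapAction(name string) (ActionName, error) {
	mapped, ok := actionMapping[name]
	if !ok {
		return "", fmt.Errorf("unsupported action %q", name)
	}
	return mapped, nil
}

func main() {
	for _, name := range []string{"click", "scroll"} {
		mapped, err := mapAction(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(name, "->", mapped)
	}
}
```

Returning an error for unknown names, rather than passing them through unchanged, keeps unsupported model output from reaching the driver silently.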

Key improvements:
- Bounding box [x1,y1,x2,y2] now converts to center point [cx,cy] for actions
- Coordinate scaling properly handles different screen resolutions
- Action names are mapped through doubao_1_5_ui_tars_action_mapping
- Enhanced error handling for invalid coordinate formats
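
The bounding-box conversion and scaling listed above can be sketched as follows. This is an assumption-laden illustration rather than the parser's actual code: `Point`, `bboxCenter`, `toScreen`, and `modelScale` are hypothetical names, and the sketch assumes the model emits coordinates in a 0..1000 space. It does not attempt to reproduce how the commit treats values above 1000.

```go
package main

import (
	"errors"
	"fmt"
)

// Point is a hypothetical coordinate pair used only in this sketch.
type Point struct{ X, Y float64 }

// modelScale is the assumed range of model-emitted coordinates (0..1000).
const modelScale = 1000.0

// bboxCenter returns the center [cx, cy] of a box given as [x1, y1, x2, y2].
// It rejects inputs that do not contain exactly four values.
func bboxCenter(box []float64) (Point, error) {
	if len(box) != 4 {
		return Point{}, errors.New("bounding box must have 4 values")
	}
	return Point{
		X: (box[0] + box[2]) / 2,
		Y: (box[1] + box[3]) / 2,
	}, nil
}

// toScreen maps a point from the assumed 0..1000 space to pixel
// coordinates for a screen of the given width and height.
func toScreen(p Point, width, height int) Point {
	return Point{
		X: p.X * float64(width) / modelScale,
		Y: p.Y * float64(height) / modelScale,
	}
}

func main() {
	center, err := bboxCenter([]float64{100, 200, 300, 400})
	if err != nil {
		panic(err)
	}
	fmt.Println("center:", center)
	fmt.Println("pixels:", toScreen(center, 1920, 1080))
}
```

Computing the center before scaling means a single scaling step per action, and the same center works for any screen resolution.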
lilong.129
2025-06-04 22:39:17 +08:00
parent 1df529ecaa
commit c204542f1f
10 changed files with 386 additions and 411 deletions

@@ -25,7 +25,7 @@ type Router struct {
 }
 
 func (r *Router) InitMCPHost(configPath string) error {
-	mcpHost, err := mcphost.NewMCPHost(configPath, false)
+	mcpHost, err := mcphost.NewMCPHost(configPath, true)
 	if err != nil {
 		log.Error().Err(err).Msg("init MCP host failed")
 		return err