httprunner/docs/uixt/mcp-tools.md
2025-06-11 14:57:08 +08:00

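# MCP Tools Documentation
## Overview
HttpRunner UIXT implements a standardized tool interface on top of the Model Context Protocol (MCP). Every UI operation is wrapped as an MCP tool that an AI model can call directly, which enables AI-driven UI automation.
## MCP Architecture
### Overall Architecture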
```
┌───────────────────────────────────────────────────────────────────┐
│                           MCP Ecosystem                           │
├───────────────────────────────────────────────────────────────────┤
│ ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐ │
│ │   MCP Client    │    │   MCP Server    │    │  Tool Registry  │ │
│ │   (AI Model)    │◄──►│  (UIXT Server)  │◄──►│ (Registration)  │ │
│ └─────────────────┘    └─────────────────┘    └─────────────────┘ │
├───────────────────────────────────────────────────────────────────┤
│                            Tool Layer                             │
│ ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐ │
│ │  Device Tools   │    │  Action Tools   │    │    AI Tools     │ │
│ └─────────────────┘    └─────────────────┘    └─────────────────┘ │
├───────────────────────────────────────────────────────────────────┤
│                        Underlying Drivers                         │
│ ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐ │
│ │ Android Driver  │    │   iOS Driver    │    │ Browser Driver  │ │
│ └─────────────────┘    └─────────────────┘    └─────────────────┘ │
└───────────────────────────────────────────────────────────────────┘
```
### Core Components
#### MCPServer4XTDriver
The main MCP protocol server:
```go
type MCPServer4XTDriver struct {
	mcpServer     *server.MCPServer                // MCP protocol server
	mcpTools      []mcp.Tool                       // registered tools
	actionToolMap map[option.ActionName]ActionTool // action-name-to-tool mapping
}
```
#### ActionTool Interface
The common contract that every MCP tool implements:
```go
type ActionTool interface {
	Name() option.ActionName           // tool name
	Description() string               // tool description
	Options() []mcp.ToolOption         // MCP option definitions
	Implement() server.ToolHandlerFunc // tool implementation logic
	// Converts a MobileAction into an MCP call request
	ConvertActionToCallToolRequest(action MobileAction) (mcp.CallToolRequest, error)
}
```
## Tool Categories
### Device Management Tools (mcp_tools_device.go)
#### list_available_devices
Discovers available devices and simulators.
```json
{
  "name": "uixt__list_available_devices",
  "description": "List all available devices including Android devices, iOS devices, and simulators",
  "inputSchema": {
    "type": "object",
    "properties": {},
    "required": []
  }
}
```
**Example response**
```json
{
  "action": "list_available_devices",
  "success": true,
  "message": "Found 1 available device",
  "devices": [
    {
      "platform": "android",
      "serial": "emulator-5554",
      "name": "Android Emulator",
      "status": "online"
    }
  ],
  "count": 1
}
```
#### select_device
Selects a specific device to operate on.
```json
{
  "name": "uixt__select_device",
  "description": "Select a specific device by platform and serial number",
  "inputSchema": {
    "type": "object",
    "properties": {
      "platform": {
        "type": "string",
        "description": "Device platform (android, ios, browser, harmony)"
      },
      "serial": {
        "type": "string",
        "description": "Device serial number or identifier"
      }
    },
    "required": ["platform", "serial"]
  }
}
```
### Touch Tools (mcp_tools_touch.go)
#### tap_xy
Taps at relative coordinates (0-1 range).
```json
{
  "name": "uixt__tap_xy",
  "description": "Tap at relative coordinates (0-1 range)",
  "inputSchema": {
    "type": "object",
    "properties": {
      "x": {
        "type": "number",
        "description": "X coordinate (0-1 range)"
      },
      "y": {
        "type": "number",
        "description": "Y coordinate (0-1 range)"
      }
    },
    "required": ["x", "y"]
  }
}
```
#### tap_abs_xy
Taps at absolute pixel coordinates.
```json
{
  "name": "uixt__tap_abs_xy",
  "description": "Tap at absolute pixel coordinates",
  "inputSchema": {
    "type": "object",
    "properties": {
      "x": {
        "type": "number",
        "description": "Absolute X coordinate in pixels"
      },
      "y": {
        "type": "number",
        "description": "Absolute Y coordinate in pixels"
      }
    },
    "required": ["x", "y"]
  }
}
```
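The relationship between the relative coordinates of `tap_xy` and the absolute pixels of `tap_abs_xy` can be sketched as follows. The helper `toAbsolute` and its round-to-nearest-pixel behavior are assumptions made for illustration; they are not taken from the UIXT source.

```go
package main

import (
	"fmt"
	"math"
)

// toAbsolute converts relative coordinates in the 0-1 range to absolute
// pixels for a screen of the given size. Rounding to the nearest pixel is
// an assumption of this sketch.
func toAbsolute(relX, relY float64, width, height int) (int, int, error) {
	if relX < 0 || relX > 1 || relY < 0 || relY > 1 {
		return 0, 0, fmt.Errorf("relative coordinates (%v, %v) are outside the 0-1 range", relX, relY)
	}
	x := int(math.Round(relX * float64(width)))
	y := int(math.Round(relY * float64(height)))
	return x, y, nil
}

func main() {
	// The center of a 1080x1920 screen.
	x, y, err := toAbsolute(0.5, 0.5, 1080, 1920)
	fmt.Println(x, y, err) // 540 960 <nil>
}
```

Relative coordinates keep a test case portable across screen sizes, while absolute coordinates are tied to one resolution.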
#### tap_ocr
Finds text on screen via OCR and taps it.
```json
{
  "name": "uixt__tap_ocr",
  "description": "Find text using OCR and tap on it",
  "inputSchema": {
    "type": "object",
    "properties": {
      "text": {
        "type": "string",
        "description": "Text to find and tap"
      },
      "regex": {
        "type": "boolean",
        "description": "Whether to use regex matching"
      },
      "index": {
        "type": "integer",
        "description": "Index of text occurrence to tap (0-based)"
      }
    },
    "required": ["text"]
  }
}
```
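How `text`, `regex`, and `index` might interact when several OCR results match can be illustrated with a small stand-alone sketch. The function `pickMatch` is hypothetical, and treating a non-regex `text` as a substring match is an assumption; the real matching rules live in the UIXT OCR code.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pickMatch returns the position in texts of the index-th entry (0-based)
// that matches pattern. With useRegex the pattern is a regular expression;
// otherwise it is matched as a substring (an assumption of this sketch).
func pickMatch(texts []string, pattern string, useRegex bool, index int) (int, error) {
	match := func(s string) bool { return strings.Contains(s, pattern) }
	if useRegex {
		re, err := regexp.Compile(pattern)
		if err != nil {
			return -1, fmt.Errorf("invalid regex %q: %w", pattern, err)
		}
		match = re.MatchString
	}
	seen := 0
	for i, s := range texts {
		if !match(s) {
			continue
		}
		if seen == index {
			return i, nil
		}
		seen++
	}
	return -1, fmt.Errorf("occurrence %d of %q not found", index, pattern)
}

func main() {
	texts := []string{"Login", "Settings", "Login with SSO"}
	fmt.Println(pickMatch(texts, "Login", false, 1)) // 2 <nil>
	fmt.Println(pickMatch(texts, "^Set", true, 0))   // 1 <nil>
}
```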
#### tap_cv
Finds a UI element via computer vision and taps it.
```json
{
  "name": "uixt__tap_cv",
  "description": "Find UI element using computer vision and tap on it",
  "inputSchema": {
    "type": "object",
    "properties": {
      "element_type": {
        "type": "string",
        "description": "Type of UI element to find"
      },
      "description": {
        "type": "string",
        "description": "Description of the element"
      }
    },
    "required": ["element_type"]
  }
}
```
### Swipe Tools (mcp_tools_swipe.go)
#### swipe
A general swipe operation that automatically detects whether a direction or explicit coordinates were supplied.
```json
{
  "name": "uixt__swipe",
  "description": "Perform swipe gesture with automatic direction or coordinate detection",
  "inputSchema": {
    "type": "object",
    "properties": {
      "direction": {
        "type": "string",
        "description": "Swipe direction (up, down, left, right)"
      },
      "from_x": {
        "type": "number",
        "description": "Start X coordinate (0-1 range)"
      },
      "from_y": {
        "type": "number",
        "description": "Start Y coordinate (0-1 range)"
      },
      "to_x": {
        "type": "number",
        "description": "End X coordinate (0-1 range)"
      },
      "to_y": {
        "type": "number",
        "description": "End Y coordinate (0-1 range)"
      }
    }
  }
}
```
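Since none of the swipe properties is required, a handler has to decide which form of input it received. One possible way to resolve this is sketched below. The type `swipeArgs`, the function `resolveSwipe`, the rule that a non-empty `direction` takes precedence, and the preset start and end points per direction are all assumptions of this sketch.

```go
package main

import "fmt"

// swipeArgs mirrors the optional properties of the swipe input schema.
type swipeArgs struct {
	Direction string
	FromX     float64
	FromY     float64
	ToX       float64
	ToY       float64
}

// resolveSwipe returns the start and end points in relative coordinates.
// A non-empty direction wins; otherwise the explicit coordinates are used.
// The preset points per direction are assumptions of this sketch.
func resolveSwipe(a swipeArgs) (fromX, fromY, toX, toY float64, err error) {
	switch a.Direction {
	case "":
		return a.FromX, a.FromY, a.ToX, a.ToY, nil
	case "up":
		return 0.5, 0.7, 0.5, 0.3, nil
	case "down":
		return 0.5, 0.3, 0.5, 0.7, nil
	case "left":
		return 0.7, 0.5, 0.3, 0.5, nil
	case "right":
		return 0.3, 0.5, 0.7, 0.5, nil
	}
	return 0, 0, 0, 0, fmt.Errorf("unknown swipe direction %q", a.Direction)
}

func main() {
	fmt.Println(resolveSwipe(swipeArgs{Direction: "up"}))
	fmt.Println(resolveSwipe(swipeArgs{FromX: 0.1, FromY: 0.9, ToX: 0.1, ToY: 0.2}))
}
```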
#### swipe_to_tap_app
Swipes through screens to find an app, then taps it.
```json
{
  "name": "uixt__swipe_to_tap_app",
  "description": "Swipe to find and tap on an app",
  "inputSchema": {
    "type": "object",
    "properties": {
      "app_name": {
        "type": "string",
        "description": "Name of the app to find and tap"
      },
      "max_swipes": {
        "type": "integer",
        "description": "Maximum number of swipes to perform"
      }
    },
    "required": ["app_name"]
  }
}
```
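The role of `max_swipes` as an upper bound can be shown with a generic search loop. This is only a sketch: `swipeToFind` and its two callbacks are hypothetical stand-ins for the app-icon lookup and the swipe gesture, and the check-then-swipe ordering is an assumption.

```go
package main

import "fmt"

// swipeToFind checks the screen, then swipes and checks again, performing at
// most maxSwipes swipes. found and swipe are hypothetical callbacks standing
// in for an app-icon lookup and a swipe gesture.
func swipeToFind(maxSwipes int, found func() bool, swipe func()) (swipes int, ok bool) {
	for {
		if found() {
			return swipes, true
		}
		if swipes >= maxSwipes {
			return swipes, false
		}
		swipe()
		swipes++
	}
}

func main() {
	// Simulated home screens: the target app is on the third page.
	pages := [][]string{{"Phone", "Mail"}, {"Maps", "Music"}, {"Notes", "Camera"}}
	page := 0
	found := func() bool {
		for _, name := range pages[page] {
			if name == "Camera" {
				return true
			}
		}
		return false
	}
	swipe := func() {
		if page < len(pages)-1 {
			page++
		}
	}
	fmt.Println(swipeToFind(5, found, swipe)) // 2 true
}
```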
### Input Tools (mcp_tools_input.go)
#### input
Types text into the focused element.
```json
{
  "name": "uixt__input",
  "description": "Input text into the focused element",
  "inputSchema": {
    "type": "object",
    "properties": {
      "text": {
        "type": "string",
        "description": "Text to input"
      }
    },
    "required": ["text"]
  }
}
```
#### set_ime
Sets the Input Method Editor (IME).
```json
{
  "name": "uixt__set_ime",
  "description": "Set the Input Method Editor (IME)",
  "inputSchema": {
    "type": "object",
    "properties": {
      "ime": {
        "type": "string",
        "description": "IME package name or identifier"
      }
    },
    "required": ["ime"]
  }
}
```
### Button Tools (mcp_tools_button.go)
#### press_button
Presses a device button.
```json
{
  "name": "uixt__press_button",
  "description": "Press a device button",
  "inputSchema": {
    "type": "object",
    "properties": {
      "button": {
        "type": "string",
        "description": "Button name (home, back, volume_up, volume_down, etc.)"
      }
    },
    "required": ["button"]
  }
}
```
### App Management Tools (mcp_tools_app.go)
#### list_packages
Lists all installed app packages.
```json
{
  "name": "uixt__list_packages",
  "description": "List all installed app packages on the device",
  "inputSchema": {
    "type": "object",
    "properties": {},
    "required": []
  }
}
```
#### app_launch
Launches an app.
```json
{
  "name": "uixt__app_launch",
  "description": "Launch an app by package name",
  "inputSchema": {
    "type": "object",
    "properties": {
      "package_name": {
        "type": "string",
        "description": "Package name of the app to launch"
      }
    },
    "required": ["package_name"]
  }
}
```
#### app_terminate
Terminates a running app.
```json
{
  "name": "uixt__app_terminate",
  "description": "Terminate a running app",
  "inputSchema": {
    "type": "object",
    "properties": {
      "package_name": {
        "type": "string",
        "description": "Package name of the app to terminate"
      }
    },
    "required": ["package_name"]
  }
}
```
### Screen Tools (mcp_tools_screen.go)
#### screenshot
Captures a screenshot of the device screen.
```json
{
  "name": "uixt__screenshot",
  "description": "Take a screenshot of the device screen",
  "inputSchema": {
    "type": "object",
    "properties": {},
    "required": []
  }
}
```
**Example response**
```json
{
  "action": "screenshot",
  "success": true,
  "message": "Screenshot captured successfully",
  "screenshot": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...",
  "width": 1080,
  "height": 1920
}
```
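A client that receives the `screenshot` field as a base64 data URI, as in the example response above, might decode it along these lines. The helper `decodeDataURI` is hypothetical and handles only the `data:<mime>;base64,<payload>` form shown in the example.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodeDataURI splits a "data:<mime>;base64,<payload>" URI into its MIME
// type and decoded bytes. Other data URI variants are rejected.
func decodeDataURI(uri string) (string, []byte, error) {
	if !strings.HasPrefix(uri, "data:") {
		return "", nil, fmt.Errorf("not a data URI")
	}
	meta, payload, ok := strings.Cut(strings.TrimPrefix(uri, "data:"), ",")
	if !ok {
		return "", nil, fmt.Errorf("data URI has no payload separator")
	}
	if !strings.HasSuffix(meta, ";base64") {
		return "", nil, fmt.Errorf("only base64 data URIs are supported")
	}
	data, err := base64.StdEncoding.DecodeString(payload)
	if err != nil {
		return "", nil, fmt.Errorf("decode payload: %w", err)
	}
	return strings.TrimSuffix(meta, ";base64"), data, nil
}

func main() {
	// "aGVsbG8=" is the base64 encoding of "hello"; a real response would
	// carry PNG bytes instead.
	mime, data, err := decodeDataURI("data:image/png;base64,aGVsbG8=")
	fmt.Println(mime, len(data), err) // image/png 5 <nil>
}
```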
#### get_screen_size
Gets the screen size of the device.
```json
{
  "name": "uixt__get_screen_size",
  "description": "Get the screen size of the device",
  "inputSchema": {
    "type": "object",
    "properties": {},
    "required": []
  }
}
```
### Utility Tools (mcp_tools_utility.go)
#### sleep
Waits for the specified number of seconds.
```json
{
  "name": "uixt__sleep",
  "description": "Sleep for specified number of seconds",
  "inputSchema": {
    "type": "object",
    "properties": {
      "seconds": {
        "type": "number",
        "description": "Number of seconds to sleep"
      }
    },
    "required": ["seconds"]
  }
}
```
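Because `seconds` is typed as a JSON `number`, fractional values such as `1.5` are valid input. A handler could turn the value into a `time.Duration` roughly as follows; `sleepDuration` is a hypothetical helper, and rejecting negative or non-finite values is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// sleepDuration converts a possibly fractional number of seconds into a
// time.Duration. Negative, NaN, and infinite values are rejected.
func sleepDuration(seconds float64) (time.Duration, error) {
	if math.IsNaN(seconds) || math.IsInf(seconds, 0) || seconds < 0 {
		return 0, fmt.Errorf("invalid sleep seconds: %v", seconds)
	}
	return time.Duration(seconds * float64(time.Second)), nil
}

func main() {
	d, err := sleepDuration(1.5)
	fmt.Println(d, err) // 1.5s <nil>
}
```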
#### close_popups
Closes popups or dialogs on the screen.
```json
{
  "name": "uixt__close_popups",
  "description": "Close popups or dialogs on the screen",
  "inputSchema": {
    "type": "object",
    "properties": {},
    "required": []
  }
}
```
### Web Tools (mcp_tools_web.go)
#### secondary_click
Performs a secondary click (right-click) at the given coordinates.
```json
{
  "name": "uixt__secondary_click",
  "description": "Perform secondary click (right-click) at coordinates",
  "inputSchema": {
    "type": "object",
    "properties": {
      "x": {
        "type": "number",
        "description": "X coordinate for secondary click"
      },
      "y": {
        "type": "number",
        "description": "Y coordinate for secondary click"
      }
    },
    "required": ["x", "y"]
  }
}
```
#### hover_by_selector
Hovers over an element located by a selector.
```json
{
  "name": "uixt__hover_by_selector",
  "description": "Hover over element by CSS selector or XPath",
  "inputSchema": {
    "type": "object",
    "properties": {
      "selector": {
        "type": "string",
        "description": "CSS selector or XPath of the element"
      }
    },
    "required": ["selector"]
  }
}
```
### AI Tools (mcp_tools_ai.go)
#### start_to_goal
Executes a task from start to goal, driven by a natural language description.
```json
{
  "name": "uixt__start_to_goal",
  "description": "Execute a task from start to goal using natural language description",
  "inputSchema": {
    "type": "object",
    "properties": {
      "goal": {
        "type": "string",
        "description": "Natural language description of the goal"
      }
    },
    "required": ["goal"]
  }
}
```
#### ai_action
Executes an AI-driven action from a natural language prompt.
```json
{
  "name": "uixt__ai_action",
  "description": "Execute AI-driven action using natural language prompt",
  "inputSchema": {
    "type": "object",
    "properties": {
      "prompt": {
        "type": "string",
        "description": "Natural language prompt for the action"
      }
    },
    "required": ["prompt"]
  }
}
```
## Tool Implementation
### ActionTool Implementation Example
```go
// Tap tool implementation
type ToolTapXY struct {
	X float64 `json:"x" desc:"X coordinate (0-1 range)"`
	Y float64 `json:"y" desc:"Y coordinate (0-1 range)"`
}

func (t *ToolTapXY) Name() option.ActionName {
	return option.ActionTapXY
}

func (t *ToolTapXY) Description() string {
	return "Tap at relative coordinates (0-1 range)"
}

func (t *ToolTapXY) Options() []mcp.ToolOption {
	// Each parameter is declared with an mcp-go functional option
	return []mcp.ToolOption{
		mcp.WithNumber("x", mcp.Required(), mcp.Description("X coordinate (0-1 range)")),
		mcp.WithNumber("y", mcp.Required(), mcp.Description("Y coordinate (0-1 range)")),
	}
}

func (t *ToolTapXY) Implement() server.ToolHandlerFunc {
	return func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
		// Parse arguments (JSON numbers arrive as float64)
		x, ok := req.Params.Arguments["x"].(float64)
		if !ok {
			return mcp.NewToolResultError("invalid x coordinate"), nil
		}
		y, ok := req.Params.Arguments["y"].(float64)
		if !ok {
			return mcp.NewToolResultError("invalid y coordinate"), nil
		}
		// Perform the action
		err := GetXTDriverFromContext(ctx).TapXY(x, y)
		if err != nil {
			return mcp.NewToolResultError(fmt.Sprintf("tap failed: %v", err)), nil
		}
		// Populate the response data
		t.X = x
		t.Y = y
		return NewMCPSuccessResponse(
			fmt.Sprintf("Tapped at coordinates (%.2f, %.2f)", x, y),
			t,
		), nil
	}
}
```
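The argument-parsing step in the handler above relies on the fact that `encoding/json` decodes every JSON number into `float64` when the target is `interface{}`. The stand-alone sketch below isolates that step and adds a 0-1 range check. The helper `relativeArg` is hypothetical, and the range check is an addition of this sketch rather than something the handler above performs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// relativeArg reads a required numeric argument and checks the 0-1 range.
// encoding/json decodes every JSON number into float64 when the target is
// interface{}, so one type assertion covers integers and decimals.
func relativeArg(args map[string]interface{}, key string) (float64, error) {
	raw, ok := args[key]
	if !ok {
		return 0, fmt.Errorf("missing required argument %q", key)
	}
	v, ok := raw.(float64)
	if !ok {
		return 0, fmt.Errorf("argument %q must be a number, got %T", key, raw)
	}
	if v < 0 || v > 1 {
		return 0, fmt.Errorf("argument %q = %v is outside the 0-1 range", key, v)
	}
	return v, nil
}

func main() {
	var args map[string]interface{}
	if err := json.Unmarshal([]byte(`{"x": 1, "y": "top"}`), &args); err != nil {
		panic(err)
	}
	fmt.Println(relativeArg(args, "x")) // 1 <nil>
	_, err := relativeArg(args, "y")
	fmt.Println(err) // reports that "y" must be a number
}
```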
### Response Format
All tools share one flattened response format:
```go
func NewMCPSuccessResponse(message string, actionTool ActionTool) *mcp.CallToolResult {
	response := map[string]interface{}{
		"action":  string(actionTool.Name()),
		"success": true,
		"message": message,
	}
	// Use reflection to copy the tool's JSON-tagged fields into the response
	toolValue := reflect.ValueOf(actionTool)
	if toolValue.Kind() == reflect.Ptr {
		toolValue = toolValue.Elem()
	}
	toolType := toolValue.Type()
	for i := 0; i < toolValue.NumField(); i++ {
		field := toolType.Field(i)
		jsonTag := field.Tag.Get("json")
		if jsonTag != "" && jsonTag != "-" {
			fieldName := strings.Split(jsonTag, ",")[0]
			response[fieldName] = toolValue.Field(i).Interface()
		}
	}
	// Return the flattened response as a single JSON text content item
	return mcp.NewToolResultText(toJSONString(response))
}
```
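The flattening step can be tried in isolation with only the standard library. In the sketch below, `tapResult` and `flatten` are hypothetical stand-ins for a tool struct and the response builder; unlike the function above, the sketch also skips unexported fields and non-struct values so that it cannot panic.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// tapResult is a hypothetical tool struct used only for this demo.
type tapResult struct {
	X       float64 `json:"x"`
	Y       float64 `json:"y"`
	Comment string  `json:"-"` // excluded by its json tag
}

// flatten merges the common fields with the JSON-tagged fields of data.
func flatten(action, message string, data interface{}) map[string]interface{} {
	response := map[string]interface{}{
		"action":  action,
		"success": true,
		"message": message,
	}
	v := reflect.ValueOf(data)
	if v.Kind() == reflect.Ptr {
		v = v.Elem()
	}
	if v.Kind() != reflect.Struct {
		return response
	}
	for i := 0; i < v.NumField(); i++ {
		field := v.Type().Field(i)
		name := strings.Split(field.Tag.Get("json"), ",")[0]
		if field.PkgPath != "" || name == "" || name == "-" {
			continue // skip unexported, untagged, and excluded fields
		}
		response[name] = v.Field(i).Interface()
	}
	return response
}

func main() {
	out, _ := json.Marshal(flatten("tap_xy", "Tapped at (0.50, 0.50)", &tapResult{X: 0.5, Y: 0.5}))
	fmt.Println(string(out))
	// {"action":"tap_xy","message":"Tapped at (0.50, 0.50)","success":true,"x":0.5,"y":0.5}
}
```

The tool-specific fields land next to `action`, `success`, and `message` rather than under a nested key, which is what "flattened" means here.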
## Tool Registration
### Server Initialization
```go
func NewMCPServer() *MCPServer4XTDriver {
	// Named s rather than server to avoid shadowing the imported server package
	s := &MCPServer4XTDriver{
		mcpTools:      make([]mcp.Tool, 0),
		actionToolMap: make(map[option.ActionName]ActionTool),
	}
	// Register all tools
	s.registerDeviceTools()
	s.registerTouchTools()
	s.registerSwipeTools()
	s.registerInputTools()
	s.registerButtonTools()
	s.registerAppTools()
	s.registerScreenTools()
	s.registerUtilityTools()
	s.registerWebTools()
	s.registerAITools()
	return s
}
```
### Tool Registration Method
```go
func (s *MCPServer4XTDriver) registerTool(tool ActionTool) {
	// Build the MCP tool definition; mcp.NewTool derives the input schema
	// (type, properties, required) from the tool's options
	options := append(
		[]mcp.ToolOption{mcp.WithDescription(tool.Description())},
		tool.Options()...,
	)
	mcpTool := mcp.NewTool(fmt.Sprintf("uixt__%s", tool.Name()), options...)
	// Register with the server
	s.mcpTools = append(s.mcpTools, mcpTool)
	s.actionToolMap[tool.Name()] = tool
}
```
## Tool Invocation
### Client Call
```go
// Call a tool through the MCP client
func callTool(c client.MCPClient, toolName string, args map[string]interface{}) (*mcp.CallToolResult, error) {
	req := mcp.CallToolRequest{
		Params: mcp.CallToolParams{
			Name:      fmt.Sprintf("uixt__%s", toolName),
			Arguments: args,
		},
	}
	return c.CallTool(context.Background(), req)
}

// Usage example (c is an initialized MCP client)
result, err := callTool(c, "tap_xy", map[string]interface{}{
	"x": 0.5,
	"y": 0.5,
})
```
### Server Handling
```go
func (s *MCPServer4XTDriver) CallTool(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
	// Extract the action name by stripping the "uixt__" prefix
	toolName := strings.TrimPrefix(req.Params.Name, "uixt__")
	actionName := option.ActionName(toolName)
	// Look up the tool
	tool, exists := s.actionToolMap[actionName]
	if !exists {
		return mcp.NewToolResultError(fmt.Sprintf("tool %s not found", toolName)), nil
	}
	// Execute the tool
	handler := tool.Implement()
	return handler(ctx, req)
}
```
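The round trip between registration (which adds the `uixt__` prefix) and dispatch (which strips it) can be reproduced without any MCP dependency. The `registry` type and `handlerFunc` signature below are simplified, hypothetical stand-ins for `MCPServer4XTDriver` and `server.ToolHandlerFunc`.

```go
package main

import (
	"fmt"
	"strings"
)

const toolPrefix = "uixt__"

// handlerFunc is a simplified stand-in for server.ToolHandlerFunc.
type handlerFunc func(args map[string]interface{}) (string, error)

// registry maps bare action names to handlers, like actionToolMap.
type registry struct {
	handlers map[string]handlerFunc
}

func newRegistry() *registry {
	return &registry{handlers: make(map[string]handlerFunc)}
}

// register stores the handler under the bare action name and returns the
// prefixed name that would be advertised to clients.
func (r *registry) register(action string, h handlerFunc) string {
	r.handlers[action] = h
	return toolPrefix + action
}

// call strips the prefix from the advertised name and dispatches.
func (r *registry) call(toolName string, args map[string]interface{}) (string, error) {
	action := strings.TrimPrefix(toolName, toolPrefix)
	h, ok := r.handlers[action]
	if !ok {
		return "", fmt.Errorf("tool %s not found", action)
	}
	return h(args)
}

func main() {
	r := newRegistry()
	name := r.register("sleep", func(args map[string]interface{}) (string, error) {
		return fmt.Sprintf("slept %v seconds", args["seconds"]), nil
	})
	fmt.Println(name) // uixt__sleep
	fmt.Println(r.call(name, map[string]interface{}{"seconds": 2}))
	fmt.Println(r.call("uixt__missing", nil))
}
```

Note that `strings.TrimPrefix` leaves names without the prefix unchanged, so a bare action name such as `sleep` also resolves in this sketch.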
## Extension Development
### Creating a Custom Tool
```go
// 1. Define the tool struct
type ToolCustomAction struct {
	Parameter1 string `json:"parameter1" desc:"Description of parameter1"`
	Parameter2 int    `json:"parameter2" desc:"Description of parameter2"`
}

// 2. Implement the ActionTool interface
func (t *ToolCustomAction) Name() option.ActionName {
	return option.ActionName("custom_action")
}

func (t *ToolCustomAction) Description() string {
	return "Perform a custom action"
}

func (t *ToolCustomAction) Options() []mcp.ToolOption {
	// parameter1 is required; parameter2 is optional
	return []mcp.ToolOption{
		mcp.WithString("parameter1", mcp.Required(), mcp.Description("Description of parameter1")),
		mcp.WithNumber("parameter2", mcp.Description("Description of parameter2")),
	}
}

func (t *ToolCustomAction) Implement() server.ToolHandlerFunc {
	return func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
		// Parse arguments
		param1, ok := req.Params.Arguments["parameter1"].(string)
		if !ok {
			return mcp.NewToolResultError("invalid parameter1"), nil
		}
		// Optional: JSON numbers arrive as float64; defaults to 0 when absent
		param2, _ := req.Params.Arguments["parameter2"].(float64)
		// Run the custom logic
		err := performCustomAction(param1, int(param2))
		if err != nil {
			return mcp.NewToolResultError(fmt.Sprintf("custom action failed: %v", err)), nil
		}
		// Populate the response data
		t.Parameter1 = param1
		t.Parameter2 = int(param2)
		return NewMCPSuccessResponse("Custom action completed", t), nil
	}
}

// 3. Register the tool
func (s *MCPServer4XTDriver) registerCustomTools() {
	s.registerTool(&ToolCustomAction{})
}
```
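### Tool Grouping
```go
// Register tools grouped by function
func (s *MCPServer4XTDriver) registerToolGroup(groupName string, tools []ActionTool) {
	for _, tool := range tools {
		// Add the group prefix
		groupedName := fmt.Sprintf("%s__%s", groupName, tool.Name())
		mcpTool := mcp.Tool{
			Name:        fmt.Sprintf("uixt__%s", groupedName),
			Description: fmt.Sprintf("[%s] %s", groupName, tool.Description()),
			InputSchema: generateInputSchema(tool),
		}
		s.mcpTools = append(s.mcpTools, mcpTool)
		// Key the map by the grouped name so that CallTool, which strips only
		// the "uixt__" prefix, can still find the tool
		s.actionToolMap[option.ActionName(groupedName)] = tool
	}
}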
```
## Best Practices
### 1. Tool Design Principles
```go
// Single responsibility: each tool does exactly one thing
type ToolSinglePurpose struct {
	// Clearly defined parameters
	TargetText string `json:"target_text" desc:"Text to search for"`
}

// Parameter validation: validate arguments inside the tool implementation
func (t *ToolSinglePurpose) Implement() server.ToolHandlerFunc {
	return func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
		// Validate arguments
		if err := t.validateParameters(req.Params.Arguments); err != nil {
			return mcp.NewToolResultError(err.Error()), nil
		}
		// Run the logic
		return t.execute(ctx, req)
	}
}
```
### 2. Error Handling
```go
// Unified error handling; returns nil when there is no error
func handleToolError(err error, toolName string) *mcp.CallToolResult {
	if err == nil {
		return nil
	}
	// Log the error
	log.Error().Err(err).Str("tool", toolName).Msg("tool execution failed")
	// Return a user-friendly error message
	return mcp.NewToolResultError(fmt.Sprintf("Tool %s failed: %v", toolName, err))
}
```
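### 3. Performance Optimization
```go
// Tool execution cache
type ToolCache struct {
	cache map[string]*mcp.CallToolResult
	mutex sync.RWMutex
}

func (c *ToolCache) GetOrExecute(key string, executor func() (*mcp.CallToolResult, error)) (*mcp.CallToolResult, error) {
	c.mutex.RLock()
	result, exists := c.cache[key]
	c.mutex.RUnlock()
	if exists {
		return result, nil
	}
	// Execute the tool outside the lock so a slow tool does not block readers;
	// concurrent misses on the same key may each run the executor
	result, err := executor()
	if err != nil {
		return nil, err
	}
	// Cache the result, creating the map on first use so the zero value works
	c.mutex.Lock()
	if c.cache == nil {
		c.cache = make(map[string]*mcp.CallToolResult)
	}
	c.cache[key] = result
	c.mutex.Unlock()
	return result, nil
}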
```
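The cache above takes an opaque `key` and leaves its construction to the caller. One way to derive a deterministic key from a tool name and its arguments is sketched below. The function `cacheKey` is hypothetical; it relies on `encoding/json` emitting map keys in sorted order, so argument order does not affect the key.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cacheKey builds a deterministic key from the tool name and its arguments.
// json.Marshal sorts map keys, so equal argument sets yield equal keys.
func cacheKey(tool string, args map[string]interface{}) (string, error) {
	encoded, err := json.Marshal(args)
	if err != nil {
		return "", fmt.Errorf("encode arguments for %s: %w", tool, err)
	}
	return tool + ":" + string(encoded), nil
}

func main() {
	a, _ := cacheKey("get_screen_size", map[string]interface{}{})
	b, _ := cacheKey("tap_ocr", map[string]interface{}{"text": "OK", "index": 0})
	fmt.Println(a) // get_screen_size:{}
	fmt.Println(b) // tap_ocr:{"index":0,"text":"OK"}
}
```

Caching is only safe for read-only tools whose result does not change between calls, such as `get_screen_size`. Action tools such as taps and swipes must run every time, and a cached `screenshot` would go stale as soon as the screen changes.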
### 4. Tool Composition
```go
// Composite tool: combines several basic tools
type ToolComposite struct {
	Steps []ToolStep `json:"steps" desc:"Sequence of tool steps"`
}

type ToolStep struct {
	Tool      string                 `json:"tool"`
	Arguments map[string]interface{} `json:"arguments"`
}

func (t *ToolComposite) Implement() server.ToolHandlerFunc {
	return func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
		// t.Steps is assumed to have been populated before the handler runs
		results := make([]interface{}, 0, len(t.Steps))
		for i, step := range t.Steps {
			// Execute each step; stop at the first failure
			result, err := executeToolStep(ctx, step)
			if err != nil {
				return mcp.NewToolResultError(fmt.Sprintf("step %d failed: %v", i+1, err)), nil
			}
			results = append(results, result)
		}
		return NewMCPSuccessResponse(
			fmt.Sprintf("Composite tool completed %d steps", len(results)),
			t,
		), nil
	}
}
```
## Troubleshooting
### Common Issues
#### Tool Registration Failure
```go
// Check tool registration
func validateToolRegistration(s *MCPServer4XTDriver) error {
	tools := s.ListTools()
	if len(tools) == 0 {
		return fmt.Errorf("no tools registered")
	}
	// Check for required tools, matching the full prefixed name
	requiredTools := []string{"tap_xy", "screenshot", "app_launch"}
	for _, required := range requiredTools {
		found := false
		for _, tool := range tools {
			if tool.Name == "uixt__"+required {
				found = true
				break
			}
		}
		if !found {
			return fmt.Errorf("required tool %s not found", required)
		}
	}
	return nil
}
```
#### Tool Call Failure
```go
// Debug a tool call
func debugToolCall(req mcp.CallToolRequest) {
	log.Debug().
		Str("tool", req.Params.Name).
		Interface("arguments", req.Params.Arguments).
		Msg("tool call debug")
	// Log the dynamic type of each argument to spot type mismatches
	for key, value := range req.Params.Arguments {
		log.Debug().
			Str("param", key).
			Str("type", fmt.Sprintf("%T", value)).
			Interface("value", value).
			Msg("parameter debug")
	}
}
```
#### Performance Issues
```go
// Monitor tool performance
func monitorToolPerformance(toolName string, executor func() (*mcp.CallToolResult, error)) (*mcp.CallToolResult, error) {
	start := time.Now()
	result, err := executor()
	elapsed := time.Since(start)
	log.Info().
		Str("tool", toolName).
		Dur("elapsed", elapsed).
		Bool("success", err == nil).
		Msg("tool performance")
	// Warn when a single call takes longer than 5 seconds
	if elapsed > 5*time.Second {
		log.Warn().
			Str("tool", toolName).
			Dur("elapsed", elapsed).
			Msg("slow tool execution")
	}
	return result, err
}
```
## References
- [Model Context Protocol specification](https://modelcontextprotocol.io/docs/)
- [MCP Go implementation](https://github.com/mark3labs/mcp-go)
- [HttpRunner UIXT MCP server documentation](mcp_server.md)