Update core documentation with v5.0.0 features and AI integration

Co-authored-by: debugtalk <3657070+debugtalk@users.noreply.github.com>
2026-06-23 08:33:45 +08:00 · 2025-08-17 04:36:13 +00:00
parent 87f3ce9206
commit 5442930e5a
4 changed files with 242 additions and 19 deletions
--- a/docs/CHANGELOG.md
+++ b/docs/CHANGELOG.md
@@ -1,5 +1,59 @@
 # Release History

+## v5.0.0 (2025-08-03)
+
+**Major Release - HttpRunner v5**
+
+This is a major release that introduces significant architectural improvements and new features, including AI integration and enhanced UI automation capabilities.
+
+### 🎉 New Features
+
+**AI Integration**
+- feat: integrate large language models (LLM) for intelligent test automation
+- feat: support multiple AI service providers (OpenAI GPT-4O, Doubao models, DeepSeek, etc.)
+- feat: AI-powered UI element detection and interaction planning
+- feat: intelligent assertion generation and validation
+- feat: natural language query processing for test scenarios
+
+**MCP (Model Context Protocol) Support**
+- feat: add MCP host functionality for AI model integration
+- feat: support MCP server connections and tool management
+- feat: enable function calling through MCP protocol
+
+**Enhanced UI Automation**
+- feat: unified driver interface for cross-platform UI automation
+- feat: enhanced Android/iOS/Harmony/Browser support
+- feat: AI-powered popup handling and smart interaction
+- feat: improved screenshot and OCR capabilities
+- feat: advanced swipe and tap operations with offset support
+
+**Core Improvements**
+- feat: Function step type for custom function execution
+- feat: Shell step type for system command execution
+- feat: enhanced parameter and configuration management
+- feat: improved session management and driver caching
+- feat: better error handling and debugging capabilities
+
+### 🔧 Technical Changes
+
+- refactor: migrate to Go modules with v5 namespace
+- refactor: improved driver architecture with extension methods
+- refactor: enhanced configuration system with environment variable support
+- change: updated dependency management and build system
+
+### 🐛 Bug Fixes
+
+- fix: improved stability in UI automation scenarios
+- fix: better handling of device connections and timeouts
+- fix: enhanced compatibility across different platforms
+
+### 📚 Documentation
+
+- docs: comprehensive update of all documentation for v5 features
+- docs: new AI integration guides and best practices
+- docs: updated architecture documentation
+- docs: enhanced developer instructions and examples
+
 ## v4.3.9 (2024-01-18)

 - feat: add Shell step type
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -2,7 +2,7 @@

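 ## Project Overview

-HttpRunner v5 is an open-source, general-purpose testing framework written in Go that supports API testing, performance testing, and UI automation testing. The project incorporates large language model technology and supports UI automation testing on the Android/iOS/Harmony/Browser platforms.
+HttpRunner v5 is an open-source, general-purpose testing framework written in Go that supports API testing, performance testing, and UI automation testing. The v5 release incorporates large language model technology, supports UI automation testing on the Android/iOS/Harmony/Browser platforms, and introduces AI integration and MCP (Model Context Protocol) support.

 ## Core Architecture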

@@ -13,14 +13,33 @@ httprunner/
 ├── cmd/                    # Command-line tool entry point
 ├── internal/               # Internal modules
 ├── pkg/                    # Public packages
-├── uixt/                   # UI testing extension module
+├── uixt/                   # UI testing extension module (v5 focus)
 ├── server/                 # HTTP server module
-├── mcphost/                # MCP (Model Context Protocol) host module
+├── mcphost/                # MCP (Model Context Protocol) host module (new in v5)
 ├── examples/               # Example code
 ├── tests/                  # Test cases
 └── docs/                   # Documentation
 ```

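+## Key Features in v5
+
+### 🤖 AI Integration
+- Supports multiple large language models (OpenAI GPT-4O, Doubao models, DeepSeek, etc.)
+- AI-driven planning and execution of UI operations
+- Intelligent assertion and query capabilities
+- Test steps described in natural language
+
+### 🔌 MCP Support
+- Model Context Protocol host functionality
+- Standardized interaction with AI models
+- Tool registration and invocation management
+
+### 📱 Enhanced UI Automation
+- Unified cross-platform driver interface
+- Support for Android/iOS/Harmony/Browser
+- AI-driven intelligent operations
+- Enhanced screenshot and OCR capabilities
+
 ## Detailed Module Analysis

 ### 1. Command-Line Module (cmd/)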
@@ -130,14 +149,26 @@ httprunner/
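 - `context.go` - Context management
 - `model.go` - Data models

-### 7. MCP Host Module (mcphost/)
+### 7. MCP Host Module (mcphost/) - New in v5

-**Purpose**: Implements Model Context Protocol host functionality and supports large language model integration
+**Purpose**: Implements Model Context Protocol host functionality and supports AI model integration
+
+**Main files**:
+- `host.go` - Core MCP host implementation
+- `config.go` - MCP configuration management
+- `chat.go` - Chat and conversation functionality
+- `dump.go` - Data export functionality

 **Features**:
-- A standalone Git repository submodule
-- Provides a communication interface to large language models
-- Supports natural-language-driven test scenarios
+- Supports connections to multiple MCP servers
+- Provides a standardized interface for interacting with AI models
+- Supports tool registration and function calling
+- Integrated into the test execution flow
+
+**Use cases**:
+- AI-driven test scenario generation
+- Test steps described in natural language
+- Intelligent analysis of test results

 ### 8. Configuration and Parsing Modules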

--- a/docs/dev-instruct.md
+++ b/docs/dev-instruct.md
@@ -26,19 +26,26 @@ type IStep interface {
 }
 ```

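-By following the `IStep` interface definition, we can implement all kinds of test step types. The step types currently supported by hrp include:
+By following the `IStep` interface definition, we can implement all kinds of test step types. The step types currently supported by HttpRunner v5 include:

-- [request](step_request.go): sends a single HTTP request
+**Protocol testing steps**
+- [request](step_request.go): sends a single HTTP/HTTP2 request
 - [api](step_api.go): references and executes another API file
 - [testcase](step_testcase.go): references and executes another test case file
-- [thinktime](step_thinktime.go): think time; waits according to the configured logic
+- [websocket](step_websocket.go): WebSocket communication
+
+**Performance testing steps**
 - [transaction](step_transaction.go): transaction mechanism, used for load testing
 - [rendezvous](step_rendezvous.go): rendezvous mechanism, used for load testing
-- [websocket](step_websocket.go): WebSocket communication
-- [android](step_ui.go): Android UI automation
-- [ios](step_ui.go): iOS UI automation
-- [harmony](step_ui.go): Harmony UI automation
-- [browser](step_ui.go): browser UI automation
+- [thinktime](step_thinktime.go): think time; waits according to the configured logic
+
+**UI automation steps**
+- [android](step_ui.go): Android UI automation (supports ADB and UIAutomator2)
+- [ios](step_ui.go): iOS UI automation (based on WebDriverAgent)
+- [harmony](step_ui.go): Harmony UI automation (based on HDC)
+- [browser](step_ui.go): browser UI automation (supports Chrome, Firefox, Safari, and Edge)
+
+**System integration steps**
 - [shell](step_shell.go): executes shell commands
 - [function](step_function.go): calls a custom function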

@@ -328,7 +335,112 @@ func (s *StepMobileUIValidation) AssertAppNotInForeground(packageName string, ms
 2. Add a new platform field to `StepMobile`
 3. Add the corresponding handling logic in the `obj()` method

-### 3. Debugging Tips
+### 3. AI Integration
+
+### 3. AI Integration
+
+HttpRunner v5 introduces full AI integration, supporting multiple large language models for intelligent test automation:
+
+**Supported AI services**:
+- **OpenAI GPT-4O**: via the OpenAI API
+- **Doubao models**: Doubao Thinking Vision Pro, Doubao UI-TARS, etc.
+- **DeepSeek models**: DeepSeek V3, etc.
+- **Custom models**: other models compatible with the OpenAI API
+
+**AI functional modules**:
+- **Planner**: generates an action plan from a screenshot and the user's intent
+- **Asserter**: intelligently verifies UI state and content
+- **Querier**: handles natural-language queries and extracts information from the UI
+
+**Configuration**:
+```go
+// Environment variable configuration
+OPENAI_GPT_4O_API_KEY=your_api_key
+DOUBAO_1_5_THINKING_VISION_PRO_250428_API_KEY=your_doubao_key
+
+// Code configuration
+aiOptions := []option.AIServiceOption{
+    option.WithAIService(option.OPENAI_GPT_4O),
+    option.WithLLMServiceConfig(&option.LLMServiceConfig{
+        PlannerModel:  option.OPENAI_GPT_4O,
+        AsserterModel: option.DOUBAO_1_5_THINKING_VISION_PRO_250428,
+        QuerierModel:  option.DEEPSEEK_V3,
+    }),
+}
+```
+
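+### 4. MCP (Model Context Protocol) Support
+
+v5 introduces MCP host functionality, enabling standardized interaction with AI models:
+
+**Main features**:
+- MCP server connection management
+- Tool registration and invocation
+- A standardized protocol for interacting with AI models
+
+**Usage**:
+```bash
+# Start the MCP host
+hrp mcp-server --config mcp-config.yaml
+
+# Use MCP in a test run
+hrp run test.yaml --mcp-config mcp-config.yaml
+```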
+
+### 5. Debugging Tips
+
+- Enable verbose logging: `--log-level debug`
+- Use the screenshot feature to debug UI issues
+- Use `--dry-run` mode to validate test cases
+- Debug AI integration issues through the MCP configuration
+
+**Supported AI services**:
+- **OpenAI GPT-4O**: via the OpenAI API
+- **Doubao models**: Doubao Thinking Vision Pro, Doubao UI-TARS, etc.
+- **DeepSeek models**: DeepSeek V3, etc.
+- **Custom models**: other models compatible with the OpenAI API
+
+**AI functional modules**:
+- **Planner**: generates an action plan from a screenshot and the user's intent
+- **Asserter**: intelligently verifies UI state and content
+- **Querier**: handles natural-language queries and extracts information from the UI
+
+**Configuration**:
+```go
+// Environment variable configuration
+OPENAI_GPT_4O_API_KEY=your_api_key
+DOUBAO_1_5_THINKING_VISION_PRO_250428_API_KEY=your_doubao_key
+
+// Code configuration
+aiOptions := []option.AIServiceOption{
+    option.WithAIService(option.OPENAI_GPT_4O),
+    option.WithLLMServiceConfig(&option.LLMServiceConfig{
+        PlannerModel:  option.OPENAI_GPT_4O,
+        AsserterModel: option.DOUBAO_1_5_THINKING_VISION_PRO_250428,
+        QuerierModel:  option.DEEPSEEK_V3,
+    }),
+}
+```
+
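+### 4. MCP (Model Context Protocol) Support
+
+v5 introduces MCP host functionality, enabling standardized interaction with AI models:
+
+**Main features**:
+- MCP server connection management
+- Tool registration and invocation
+- A standardized protocol for interacting with AI models
+
+**Usage**:
+```bash
+# Start the MCP host
+hrp mcp-server --config mcp-config.yaml
+
+# Use MCP in a test run
+hrp run test.yaml --mcp-config mcp-config.yaml
+```
+
+### 5. Enhanced UI Automation

 - Use `SetRequestsLogOn()` to enable detailed request logging
 - Use `SetPluginLogOn()` to enable plugin logging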
--- a/docs/uixt/README.md
+++ b/docs/uixt/README.md
@@ -2,7 +2,7 @@

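 ## 🚀 Overview

-HttpRunner UIXT (UI eXtended Testing) is the cross-platform UI automation testing module introduced in HttpRunner v4.3.0+. It provides a unified API for UI automation testing across multiple platforms and integrates advanced AI capabilities to enable truly intelligent UI automation testing.
+HttpRunner UIXT (UI eXtended Testing) is the core UI automation testing module of HttpRunner v5.0. It provides a unified API for UI automation testing across multiple platforms and integrates advanced AI capabilities to enable truly intelligent UI automation testing.

 ### Core Features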

@@ -19,7 +19,7 @@ HttpRunner UIXT（UI eXtended Testing）是 HttpRunner v4.3.0+ 引入的跨平

 ```
 ┌─────────────────────────────────────────────────────────────────┐
-│                        HttpRunner UIXT                         │
+│                        HttpRunner UIXT v5                      │
 ├─────────────────────────────────────────────────────────────────┤
 │                   XTDriver (Extended Driver)                    │
 │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
@@ -32,15 +32,41 @@ HttpRunner UIXT（UI eXtended Testing）是 HttpRunner v4.3.0+ 引入的跨平
 │  │  Android Driver │  │   iOS Driver    │  │  Browser Driver │  │
 │  │  (ADB/UIA2)     │  │     (WDA)       │  │   (WebDriver)   │  │
 │  └─────────────────┘  └─────────────────┘  └─────────────────┘  │
+│  ┌─────────────────┐                                            │
+│  │ Harmony Driver  │                                            │
+│  │     (HDC)       │                                            │
+│  └─────────────────┘                                            │
 ├─────────────────────────────────────────────────────────────────┤
 │                     Device Layer                                │
 │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
 │  │  Android Device │  │   iOS Device    │  │  Browser Device │  │
 │  │ (real/emulator) │  │ (real/emulator) │  │    (browser)    │  │
 │  └─────────────────┘  └─────────────────┘  └─────────────────┘  │
+│  ┌─────────────────┐                                            │
+│  │ Harmony Device  │                                            │
+│  │ (real/emulator) │                                            │
+│  └─────────────────┘                                            │
 └─────────────────────────────────────────────────────────────────┘
 ```

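+### What's New in v5
+
+#### 🤖 AI-Driven Intelligence
+- **LLM integration**: supports multiple large language models, including OpenAI GPT-4O, Doubao models, and DeepSeek
+- **Intelligent planning**: AI automatically analyzes screenshots and generates action sequences
+- **Intelligent assertions**: verifies assertions based on natural-language descriptions
+- **Intelligent queries**: extracts structured information from the UI
+
+#### 🔌 MCP Protocol Support
+- **Standardized interface**: tool invocation based on the Model Context Protocol
+- **Extensibility**: supports dynamic registration of new AI tools
+- **Cross-platform compatibility**: a unified protocol for interacting with AI models
+
+#### 📱 Enhanced Platform Support
+- **Harmony platform**: adds full HarmonyOS support
+- **Browser enhancements**: improved WebDriver integration
+- **Driver caching**: intelligent session management and resource reuse
+
 ### Core Design Principles

 #### 1. Layered Architecture Design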