mirror of
https://github.com/Awuqing/BackupX.git
synced 2026-06-11 20:59:37 +08:00
feat(BackupX): harden agent cluster backup workflow
Squash merge PR #61
@@ -8,6 +8,8 @@ description: File, MySQL, PostgreSQL, SQLite and SAP HANA — what they back up

BackupX supports five built-in backup types. Type determines which runner executes the job.

When a task is routed to a remote Agent, the source tools and paths are resolved on that Agent host. Multi-target uploads are still tracked per storage target; if at least one target succeeds, the backup record is marked successful and the per-target result table shows partial failures.
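
The success rule for multi-target uploads can be sketched as follows. This is an illustration only, not BackupX's implementation: `TargetResult`, `summarize`, the result keys, and the target names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TargetResult:
    """Outcome of uploading one backup artifact to one storage target (hypothetical type)."""
    target: str
    ok: bool
    error: str = ""


def summarize(results: list[TargetResult]) -> dict:
    """Apply the rule from the text: the record succeeds if any target succeeded."""
    failed = [r for r in results if not r.ok]
    succeeded = len(results) - len(failed)
    return {
        "record_status": "success" if succeeded > 0 else "failed",
        # True when the record is successful but some targets still failed.
        "partial_failure": succeeded > 0 and len(failed) > 0,
        "failed_targets": [r.target for r in failed],
    }


# Made-up target names, for illustration only.
results = [
    TargetResult("s3-primary", ok=True),
    TargetResult("nas-offsite", ok=False, error="connection refused"),
]
print(summarize(results))
```

Keeping the per-target list next to the overall status is what allows a successful record to still surface its partial failures.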

## File / Directory

Tars (and optionally gzips) one or more filesystem paths.
@@ -62,6 +62,8 @@ The script runs automatically and:
5. Runs `systemctl enable --now backupx-agent`
6. Polls `/api/v1/agent/self` until the master confirms `status: online` (up to 30 s)

Docker mode uses the same `BACKUPX_AGENT_MASTER`, `BACKUPX_AGENT_TOKEN`, and `BACKUPX_AGENT_TEMP_DIR=/var/lib/backupx-agent/tmp` environment contract. After starting the container, the installer also probes `/api/v1/agent/self`; if the node does not come online, it prints `docker ps` and `docker logs --tail=100 backupx-agent` diagnostics before exiting non-zero.
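
The online probe amounts to polling with a deadline. A minimal sketch, assuming the status is obtained through a caller-supplied function: the real installer is a shell script, and the request format, authentication, and the 2-second interval used here are assumptions. Only the 30-second budget and the `online` status come from the text.

```python
import time
from typing import Callable


def wait_until_online(
    fetch_status: Callable[[], str],
    timeout_s: float = 30.0,   # budget stated in the text
    interval_s: float = 2.0,   # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll fetch_status() until it returns "online" or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        try:
            if fetch_status() == "online":
                return True
        except OSError:
            # Treat connection errors as "not online yet" and keep polling.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Stand-in for an HTTP GET of /api/v1/agent/self: offline twice, then online.
responses = iter(["offline", "offline", "online"])
print(wait_until_online(lambda: next(responses), sleep=lambda _s: None))
```

A `False` return corresponds to the point where the installer would print its diagnostics and exit non-zero.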

If you choose the URL-based fallback command and `curl` prints HTML or the shell reports `Syntax error: newline unexpected`, the install URL is being served by the web console instead of the backend. Ensure either `/api/install/` or `/install/` is forwarded to the BackupX backend, or use the embedded command generated by the console.

Reruns are idempotent: to upgrade or re-provision, generate a new install command and run it again. The one-time install link expires after its TTL or after first consumption, whichever comes first.
@@ -81,9 +83,15 @@ In the **Backup Tasks** page, pick the target node when creating the task. When
- Local (`nodeId=0`) → Master executes in-process
- Remote node → Master enqueues the command → Agent claims → Agent runs locally → uploads → reports back
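
The two routes above can be sketched with an in-memory queue. Everything here is hypothetical (`Command`, `CommandQueue`, `dispatch`, and the `succeeded`/`failed` terminal states); only `nodeId=0` meaning local execution and the `pending`/`dispatched` states are taken from the documentation.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass
class Command:
    id: int
    node_id: int
    payload: str
    status: str = "pending"


@dataclass
class CommandQueue:
    """In-memory stand-in for the Master's command queue (hypothetical)."""
    commands: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def enqueue(self, node_id: int, payload: str) -> Command:
        cmd = Command(next(self._ids), node_id, payload)
        self.commands.append(cmd)
        return cmd

    def claim(self, node_id: int) -> Optional[Command]:
        """Called by an Agent: take the oldest pending command for its node."""
        for cmd in self.commands:
            if cmd.node_id == node_id and cmd.status == "pending":
                cmd.status = "dispatched"
                return cmd
        return None

    def report(self, cmd_id: int, ok: bool) -> None:
        """Called by an Agent once the backup has run and uploaded."""
        for cmd in self.commands:
            if cmd.id == cmd_id:
                cmd.status = "succeeded" if ok else "failed"  # assumed state names


def dispatch(queue: CommandQueue, node_id: int, task: str) -> str:
    """Route a task: node 0 runs on the Master, anything else is queued."""
    if node_id == 0:
        return f"{task}: executed in-process on the Master"
    cmd = queue.enqueue(node_id, task)
    return f"{task}: enqueued as command {cmd.id} for node {node_id}"


queue = CommandQueue()
print(dispatch(queue, node_id=0, task="backup-etc"))
print(dispatch(queue, node_id=7, task="backup-db"))
claimed = queue.claim(node_id=7)      # the Agent on node 7 claims its command
queue.report(claimed.id, ok=True)     # and reports back after uploading
print(claimed.status)
```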

The node table shows the Agent health and command queue state: pending/dispatched depth, running long commands, timeouts, oldest active command age, and the latest Agent-side error. The same queue depth, running-command, and timeout snapshots are exported as Prometheus metrics:

- `backupx_agent_command_queue_depth`
- `backupx_agent_command_running`
- `backupx_agent_command_timeout_total`
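
As a rough illustration of consuming these metrics, the sketch below sums samples from Prometheus text exposition output. The three metric names are from the list above; the `node` label and the sample values are made up, and the labels BackupX actually attaches are not stated here.

```python
def sum_metrics(exposition: str, names: set[str]) -> dict[str, float]:
    """Sum every sample of the given metric names, across all label sets."""
    totals = {name: 0.0 for name in names}
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE/comment lines
        if "{" in line:
            # name{labels} value [timestamp]
            name = line[: line.index("{")]
            rest = line[line.rindex("}") + 1 :].split()
        else:
            # name value [timestamp]
            name, *rest = line.split()
        if name in totals and rest:
            totals[name] += float(rest[0])
    return totals


# Hypothetical scrape output; the label name and values are invented.
SAMPLE = """\
# sample scrape output
backupx_agent_command_queue_depth{node="edge-1"} 3
backupx_agent_command_queue_depth{node="edge-2"} 1
backupx_agent_command_running{node="edge-1"} 1
backupx_agent_command_timeout_total{node="edge-1"} 2
"""

WATCHED = {
    "backupx_agent_command_queue_depth",
    "backupx_agent_command_running",
    "backupx_agent_command_timeout_total",
}
print(sum_metrics(SAMPLE, WATCHED))
```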

## Known limitations

- **Encrypted backups don't work via Agent**: the Agent doesn't hold Master's AES-256 key. Tasks with `encrypt: true` will fail if routed to an Agent
- **Encrypted backups are Master-only**: the Agent doesn't hold Master's AES-256 key. Creating or updating a task with `encrypt: true` and a remote node or node pool is rejected up front
- **Directory browser timeout**: remote dir listing is a synchronous RPC through the queue (15s default)
- **Dispatched command timeout**: claimed-but-unfinished commands are marked `timeout` after 10 minutes
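
The dispatched-command timeout can be pictured as a periodic sweep. This is a sketch under assumptions: the dictionary fields and the idea of measuring from the claim time are hypothetical; the 10-minute limit and the `dispatched` and `timeout` states come from the text.

```python
from datetime import datetime, timedelta, timezone

DISPATCH_TIMEOUT = timedelta(minutes=10)  # limit stated in the text


def sweep_timeouts(commands: list[dict], now: datetime) -> list[int]:
    """Mark dispatched commands older than DISPATCH_TIMEOUT as "timeout".

    Returns the ids of the commands that were just timed out.
    """
    timed_out = []
    for cmd in commands:
        # Assumption: the clock starts when the Agent claims the command.
        if cmd["status"] == "dispatched" and now - cmd["claimed_at"] > DISPATCH_TIMEOUT:
            cmd["status"] = "timeout"
            timed_out.append(cmd["id"])
    return timed_out


now = datetime(2026, 6, 11, 12, 0, tzinfo=timezone.utc)
commands = [
    {"id": 1, "status": "dispatched", "claimed_at": now - timedelta(minutes=11)},
    {"id": 2, "status": "dispatched", "claimed_at": now - timedelta(minutes=3)},
    {"id": 3, "status": "pending", "claimed_at": None},
]
print(sweep_timeouts(commands, now))
```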
@@ -42,6 +42,8 @@ Go to **Backup Tasks → New**. Three steps:
2. **Source** — paths for file backup (multi-source supported), or connection info for databases
3. **Storage & policy** — pick target(s), compression, retention days, encryption on/off

For Agent-routed tasks, encryption must stay off because the Agent never receives the Master's encryption key. BackupX rejects remote-node or node-pool tasks with encryption enabled during create/update.
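
The create/update check described above might look roughly like this. The function, its parameters, and the error type are hypothetical; the sketch only mirrors the stated rule that encryption cannot be combined with a remote node or node pool.

```python
from typing import Optional


class TaskValidationError(ValueError):
    """Raised when a task definition is rejected at create/update time (hypothetical)."""


def validate_task(encrypt: bool, node_id: int = 0, node_pool: Optional[str] = None) -> None:
    """Reject encrypted tasks that would run on an Agent instead of the Master."""
    runs_remotely = node_id != 0 or node_pool is not None
    if encrypt and runs_remotely:
        raise TaskValidationError(
            "encrypted backups are Master-only: disable encryption or run the task locally"
        )


validate_task(encrypt=True, node_id=0)   # local and encrypted: accepted
validate_task(encrypt=False, node_id=7)  # remote and unencrypted: accepted
try:
    validate_task(encrypt=True, node_id=7)
except TaskValidationError as exc:
    print(f"rejected: {exc}")
```

Rejecting at save time means the problem surfaces in the task form rather than as a failed backup run on the Agent.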

Save, then click **Run Now** to trigger a test. Live logs stream on the **Backup Records** page.

:::note
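@@ -8,6 +8,8 @@ description: File, MySQL, PostgreSQL, SQLite and SAP HANA - their respective

BackupX supports five built-in backup types; the type determines which runner executes the job.

When a task is routed to a remote Agent, source paths and external tools are resolved on that Agent host. Multi-target uploads are still recorded per target; as long as at least one target upload succeeds, the backup record counts as successful, and the target result table in the details view shows the partial failures.

## File / Directory

Archives (optionally with gzip) one or more filesystem paths.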
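@@ -62,6 +62,8 @@ Web console → **Node Management** → **Add Node**, opens a three-step wizard:
5. Runs `systemctl enable --now backupx-agent`
6. Polls `/api/v1/agent/self` until the Master confirms `status: online` (up to 30 seconds)

Docker mode uses the same environment variable contract: `BACKUPX_AGENT_MASTER`, `BACKUPX_AGENT_TOKEN`, and `BACKUPX_AGENT_TEMP_DIR=/var/lib/backupx-agent/tmp`. After the container starts, the install script likewise probes `/api/v1/agent/self`; if the node does not come online, it prints the `docker ps` and `docker logs --tail=100 backupx-agent` troubleshooting commands and exits with a non-zero status.

If `curl` prints HTML or the shell reports `Syntax error: newline unexpected` when you use the URL fallback command, the install URL is being handled by the web console instead of being forwarded to the backend. Make sure at least one of `/api/install/` or `/install/` is forwarded to the BackupX backend, or switch to the embedded command generated by the console.

The script is idempotent: to upgrade or reinstall, generate a new install command and run it again. The one-time install link becomes invalid as soon as its TTL expires or it is consumed for the first time.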
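@@ -81,9 +83,15 @@ Web console → **Node Management** → **Add Node**, opens a three-step wizard:
- Local / unspecified (`nodeId=0`): the Master executes in-process
- Remote node: Master writes to the command queue → Agent pulls → Agent executes locally → uploads → reports back

The node list shows Agent health and command queue state: pending/dispatched depth, long-running tasks in progress, timeout count, age of the oldest active command, and the most recent Agent error. The same queue depth, running-command count, and timeout snapshots are exported as Prometheus metrics:

- `backupx_agent_command_queue_depth`
- `backupx_agent_command_running`
- `backupx_agent_command_timeout_total`

## Known limitations

- **Agent does not support encrypted backups**: the Agent does not hold the Master's AES-256 key. Tasks with `encrypt: true` report failure immediately when routed to an Agent
- **Encrypted backups are supported only for local execution on the Master**: the Agent does not hold the Master's AES-256 key. When creating or updating a task, `encrypt: true` combined with a remote node or node pool is rejected at the entry point
- **Directory browsing timeout**: remote directory browsing is a synchronous RPC through the command queue, with a 15s default timeout
- **Dispatched command timeout**: commands claimed by an Agent but not finished within 10 minutes are set to `timeout`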
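@@ -42,6 +42,8 @@ description: Deploy BackupX, add a storage target, create your first backup task
2. **Source**: choose source paths for file backups (multiple supported), or fill in connection info for database backups
3. **Storage & policy**: choose storage target(s) (multiple supported), compression policy, retention days, and whether to encrypt

For tasks routed to an Agent, encryption must be turned off because the Agent never receives the Master's encryption key. BackupX rejects remote-node or node-pool tasks with encryption enabled at the create/update stage.

After saving, click **Run Now** to test, and watch the execution logs in real time on the **Backup Records** page.

:::note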