mirror of
https://github.com/Awuqing/BackupX.git
synced 2026-05-27 19:19:35 +08:00
feat(BackupX): harden agent cluster backup workflow
Squash merge PR #61
This commit is contained in:
@@ -62,6 +62,8 @@ The script runs automatically and:
|
||||
5. Runs `systemctl enable --now backupx-agent`
|
||||
6. Polls `/api/v1/agent/self` until the master confirms `status: online` (up to 30 s)
|
||||
|
||||
Docker mode uses the same `BACKUPX_AGENT_MASTER`, `BACKUPX_AGENT_TOKEN`, and `BACKUPX_AGENT_TEMP_DIR=/var/lib/backupx-agent/tmp` environment contract. After starting the container, the installer also probes `/api/v1/agent/self`; if the node does not come online, it prints `docker ps` and `docker logs --tail=100 backupx-agent` diagnostics before exiting non-zero.
|
||||
|
||||
If you choose the URL-based fallback command and `curl` prints HTML or the shell reports `Syntax error: newline unexpected`, the install URL is being served by the web console instead of the backend. Ensure either `/api/install/` or `/install/` is forwarded to the BackupX backend, or use the embedded command generated by the console.
|
||||
|
||||
Reruns are idempotent — to upgrade or re-provision, simply generate a new install command and run it again. The one-time install link expires after its TTL or after first consumption, whichever is sooner.
|
||||
@@ -81,9 +83,15 @@ In the **Backup Tasks** page, pick the target node when creating the task. When
|
||||
- Local (`nodeId=0`) → Master executes in-process
|
||||
- Remote node → Master enqueues the command → Agent claims → Agent runs locally → uploads → reports back
|
||||
|
||||
The node table shows the Agent health and command queue state: pending/dispatched depth, running long commands, timeouts, oldest active command age, and the latest Agent-side error. The same queue depth, running-command, and timeout snapshots are exported as Prometheus metrics:
|
||||
|
||||
- `backupx_agent_command_queue_depth`
|
||||
- `backupx_agent_command_running`
|
||||
- `backupx_agent_command_timeout_total`
|
||||
|
||||
## Known limitations
|
||||
|
||||
- **Encrypted backups don't work via Agent** — the Agent doesn't hold Master's AES-256 key. Tasks with `encrypt: true` will fail if routed to an Agent
|
||||
- **Encrypted backups are Master-only** — the Agent doesn't hold Master's AES-256 key. Creating or updating a task with `encrypt: true` and a remote node or node pool is rejected up front
|
||||
- **Directory browser timeout** — remote dir listing is a synchronous RPC through the queue (15s default)
|
||||
- **Dispatched command timeout** — claimed-but-unfinished commands are marked `timeout` after 10 minutes
|
||||
|
||||
|
||||
Reference in New Issue
Block a user