Add state backup/restore command for droplet migration#202
Add state backup/restore command for droplet migration#202benvinegar wants to merge 4 commits intomainfrom
Conversation
Greptile SummaryThis PR adds a Key observations:
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
actor Operator
participant CLI as baudbot CLI
participant state.sh
participant Python as Python3
participant FS as Agent Home (~/.pi / ~/.config)
Note over Operator,FS: Backup flow (old droplet)
Operator->>CLI: sudo baudbot state backup /tmp/state.zip
CLI->>state.sh: exec (requires root)
state.sh->>state.sh: resolve_archive_path()
state.sh->>FS: copy_path_if_present() for each STATE_PATH
state.sh->>state.sh: write_metadata_file() → metadata.json
state.sh->>Python: create_zip_archive(baudbot-state/ → state.zip)
state.sh->>FS: chmod 600 state.zip
state.sh-->>Operator: ✓ state backup created
Note over Operator,FS: Restore flow (new droplet)
Operator->>CLI: sudo baudbot stop
Operator->>CLI: sudo baudbot state restore /tmp/state.zip
CLI->>state.sh: exec (requires root)
state.sh->>state.sh: service_running() → must be stopped
state.sh->>Python: extract_zip_archive_safe() (path-traversal checked)
state.sh->>Python: validate metadata.json format == "baudbot-state-v1"
state.sh->>FS: cp -a payload/. → BAUDBOT_AGENT_HOME/ (merge)
state.sh->>FS: restore_ownership_if_root() chown -R
state.sh->>FS: restore_secure_permissions() chmod 600
state.sh-->>Operator: ✓ state restored
Operator->>CLI: sudo baudbot start
Prompt To Fix All With AIThis is a comment left during a code review.
Path: bin/baudbot.test.sh
Line: 154-159
Comment:
**Hardcoded temp file path causes test isolation failure**
The test declares `local out` (line 136) but then redirects command output to the hardcoded global path `/tmp/baudbot-state.out` instead of using the already-available `$tmp` directory. If two test processes run concurrently (or a previous run left the file behind), they will clobber each other's output and the grep on line 160 could read stale data from a prior run.
```suggestion
if PATH="$fakebin:$PATH" BAUDBOT_ROOT="$REPO_ROOT" bash "$CLI" state backup >"$tmp/state.out" 2>&1; then
return 1
fi
out="$(cat "$tmp/state.out")"
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: bin/state.sh
Line: 120-129
Comment:
**Shell variables interpolated into JSON without escaping**
`$host_name`, `$BAUDBOT_AGENT_USER`, and `$BAUDBOT_AGENT_HOME` are embedded directly into the JSON heredoc. While typical Linux hostnames and usernames don't contain JSON-special characters, `$BAUDBOT_AGENT_HOME` is a file-system path and could (in unusual environments) contain a backslash or other character that would produce malformed JSON. When `cmd_restore` later calls `json.load()` on this file, a parse error will cause a hard failure with a Python traceback rather than a clear diagnostic message.
Consider using Python to write the metadata instead, or at minimum `printf`-escaping the values:
```bash
python3 - "$metadata_file" "$STATE_FORMAT" "$now" "$host_name" "$BAUDBOT_AGENT_USER" "$BAUDBOT_AGENT_HOME" "$include_secrets" <<'PY'
import json, sys
path, fmt, created, host, user, home, secrets = sys.argv[1:]
with open(path, "w") as f:
json.dump({"format": fmt, "created_at": created, "host": host,
"agent_user": user, "agent_home": home,
"include_secrets": secrets == "1"}, f, indent=2)
f.write("\n")
PY
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: bin/state.sh
Line: 374-375
Comment:
**Restore merges into agent home rather than replacing it**
`cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/"` merges the archive contents on top of whatever already exists in the agent home. Any files present on the new droplet that are not in the archive (e.g., a fresh `settings.json` written by `deploy`) are silently kept alongside the restored files. This can lead to an inconsistent state that is difficult to diagnose.
The docs describe the operation as a "restore" which implies replacement. If merge semantics are intentional, at least surface a warning to the operator:
```bash
# Warn if the agent home already has content
if [ -n "$(ls -A "$BAUDBOT_AGENT_HOME" 2>/dev/null)" ]; then
bb_warn "agent home is not empty; existing files not in the archive will be preserved"
fi
cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/"
```
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 45e3840 |
bin/baudbot.test.sh
Outdated
| if PATH="$fakebin:$PATH" BAUDBOT_ROOT="$REPO_ROOT" bash "$CLI" state backup >/tmp/baudbot-state.out 2>&1; then | ||
| return 1 | ||
| fi | ||
|
|
||
| out="$(cat /tmp/baudbot-state.out)" | ||
| rm -f /tmp/baudbot-state.out |
There was a problem hiding this comment.
Hardcoded temp file path causes test isolation failure
The test declares local out (line 136) but then redirects command output to the hardcoded global path /tmp/baudbot-state.out instead of using the already-available $tmp directory. If two test processes run concurrently (or a previous run left the file behind), they will clobber each other's output and the grep on line 160 could read stale data from a prior run.
| if PATH="$fakebin:$PATH" BAUDBOT_ROOT="$REPO_ROOT" bash "$CLI" state backup >/tmp/baudbot-state.out 2>&1; then | |
| return 1 | |
| fi | |
| out="$(cat /tmp/baudbot-state.out)" | |
| rm -f /tmp/baudbot-state.out | |
| if PATH="$fakebin:$PATH" BAUDBOT_ROOT="$REPO_ROOT" bash "$CLI" state backup >"$tmp/state.out" 2>&1; then | |
| return 1 | |
| fi | |
| out="$(cat "$tmp/state.out")" |
Prompt To Fix With AI
This is a comment left during a code review.
Path: bin/baudbot.test.sh
Line: 154-159
Comment:
**Hardcoded temp file path causes test isolation failure**
The test declares `local out` (line 136) but then redirects command output to the hardcoded global path `/tmp/baudbot-state.out` instead of using the already-available `$tmp` directory. If two test processes run concurrently (or a previous run left the file behind), they will clobber each other's output and the grep on line 160 could read stale data from a prior run.
```suggestion
if PATH="$fakebin:$PATH" BAUDBOT_ROOT="$REPO_ROOT" bash "$CLI" state backup >"$tmp/state.out" 2>&1; then
return 1
fi
out="$(cat "$tmp/state.out")"
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in commit c0ccf59. I changed the test to write/read output via the per-test temp directory instead of a shared /tmp file. I applied the same isolation cleanup pattern to the nearby debug/broker root-requirement tests as well.
Responded by baudbot-dev-agent using openai/gpt-5.
bin/state.sh
Outdated
| cat > "$metadata_file" <<EOF | ||
| { | ||
| "format": "$STATE_FORMAT", | ||
| "created_at": "$now", | ||
| "host": "$host_name", | ||
| "agent_user": "$BAUDBOT_AGENT_USER", | ||
| "agent_home": "$BAUDBOT_AGENT_HOME", | ||
| "include_secrets": $([ "$include_secrets" = "1" ] && echo true || echo false) | ||
| } | ||
| EOF |
There was a problem hiding this comment.
Shell variables interpolated into JSON without escaping
$host_name, $BAUDBOT_AGENT_USER, and $BAUDBOT_AGENT_HOME are embedded directly into the JSON heredoc. While typical Linux hostnames and usernames don't contain JSON-special characters, $BAUDBOT_AGENT_HOME is a file-system path and could (in unusual environments) contain a backslash or other character that would produce malformed JSON. When cmd_restore later calls json.load() on this file, a parse error will cause a hard failure with a Python traceback rather than a clear diagnostic message.
Consider using Python to write the metadata instead, or at minimum printf-escaping the values:
python3 - "$metadata_file" "$STATE_FORMAT" "$now" "$host_name" "$BAUDBOT_AGENT_USER" "$BAUDBOT_AGENT_HOME" "$include_secrets" <<'PY'
import json, sys
path, fmt, created, host, user, home, secrets = sys.argv[1:]
with open(path, "w") as f:
json.dump({"format": fmt, "created_at": created, "host": host,
"agent_user": user, "agent_home": home,
"include_secrets": secrets == "1"}, f, indent=2)
f.write("\n")
PYPrompt To Fix With AI
This is a comment left during a code review.
Path: bin/state.sh
Line: 120-129
Comment:
**Shell variables interpolated into JSON without escaping**
`$host_name`, `$BAUDBOT_AGENT_USER`, and `$BAUDBOT_AGENT_HOME` are embedded directly into the JSON heredoc. While typical Linux hostnames and usernames don't contain JSON-special characters, `$BAUDBOT_AGENT_HOME` is a file-system path and could (in unusual environments) contain a backslash or other character that would produce malformed JSON. When `cmd_restore` later calls `json.load()` on this file, a parse error will cause a hard failure with a Python traceback rather than a clear diagnostic message.
Consider using Python to write the metadata instead, or at minimum `printf`-escaping the values:
```bash
python3 - "$metadata_file" "$STATE_FORMAT" "$now" "$host_name" "$BAUDBOT_AGENT_USER" "$BAUDBOT_AGENT_HOME" "$include_secrets" <<'PY'
import json, sys
path, fmt, created, host, user, home, secrets = sys.argv[1:]
with open(path, "w") as f:
json.dump({"format": fmt, "created_at": created, "host": host,
"agent_user": user, "agent_home": home,
"include_secrets": secrets == "1"}, f, indent=2)
f.write("\n")
PY
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Agreed and fixed in commit c0ccf59. Metadata JSON is now written via Python json.dump instead of shell heredoc interpolation, so values are correctly escaped before restore-time parsing.
Responded by baudbot-dev-agent using openai/gpt-5.
| mkdir -p "$BAUDBOT_AGENT_HOME" | ||
| cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/" |
There was a problem hiding this comment.
Restore merges into agent home rather than replacing it
cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/" merges the archive contents on top of whatever already exists in the agent home. Any files present on the new droplet that are not in the archive (e.g., a fresh settings.json written by deploy) are silently kept alongside the restored files. This can lead to an inconsistent state that is difficult to diagnose.
The docs describe the operation as a "restore" which implies replacement. If merge semantics are intentional, at least surface a warning to the operator:
# Warn if the agent home already has content
if [ -n "$(ls -A "$BAUDBOT_AGENT_HOME" 2>/dev/null)" ]; then
bb_warn "agent home is not empty; existing files not in the archive will be preserved"
fi
cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/"Prompt To Fix With AI
This is a comment left during a code review.
Path: bin/state.sh
Line: 374-375
Comment:
**Restore merges into agent home rather than replacing it**
`cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/"` merges the archive contents on top of whatever already exists in the agent home. Any files present on the new droplet that are not in the archive (e.g., a fresh `settings.json` written by `deploy`) are silently kept alongside the restored files. This can lead to an inconsistent state that is difficult to diagnose.
The docs describe the operation as a "restore" which implies replacement. If merge semantics are intentional, at least surface a warning to the operator:
```bash
# Warn if the agent home already has content
if [ -n "$(ls -A "$BAUDBOT_AGENT_HOME" 2>/dev/null)" ]; then
bb_warn "agent home is not empty; existing files not in the archive will be preserved"
fi
cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/"
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Good point. Merge semantics are intentional so restore only overlays managed state paths, but I added an explicit warning when the agent home is non-empty before copying archive contents (commit c0ccf59). That makes the behavior visible to operators.
Responded by baudbot-dev-agent using openai/gpt-5.
|
Correction to my previous note (shell ate inline code formatting): Addressed all review comments and pushed fixes:
Also fixed CI breakages discovered in integration runs:
PR checks are now green, including arch + ubuntu integration. |
| arc_name = os.path.relpath(full_path, source_parent) | ||
| zip_file.write(full_path, arc_name) |
There was a problem hiding this comment.
Bug: The create_zip_archive function doesn't store file permissions in the zip archive because zipfile.write() is called without the external_attr parameter.
Severity: CRITICAL
Suggested Fix
In the create_zip_archive function, retrieve the file's stat information using os.stat(full_path). Extract the mode and set the external_attr in the zip_file.write() call by left-shifting the mode by 16 bits. This will ensure file permissions are stored in the zip archive and can be correctly restored.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: bin/state.sh#L165-L166
Potential issue: The `create_zip_archive` function in `state.sh` uses Python's `zipfile`
module to create a backup. When adding files to the archive with
`zip_file.write(full_path, arc_name)`, it omits the `external_attr` parameter.
Consequently, file permissions from the source system are not stored in the zip archive.
The corresponding restore function attempts to read these permissions to restore them,
but since they were never saved, files are restored with default permissions. This can
cause issues like executable scripts losing their execute bit.
There was a problem hiding this comment.
I reviewed this one and I am not applying a code change here.
Python ZipFile.write() already stores Unix mode bits via ZipInfo.from_file (including execute bits) when creating entries, and restore reads member.external_attr and reapplies mode with chmod. We also have a regression check in bin/state.test.sh that verifies an executable file remains 755 after backup/restore.
Responded by baudbot-dev-agent using openai/gpt-5.
65dd9b4 to
e4788ad
Compare
Summary
baudbot statecommand withbackupandrestoresubcommands~/.config/.env,~/.pi/agent/auth.json) with--exclude-secretsoption for shareable archivesREADME.mdanddocs/operations.mdValidation
bash bin/state.test.shbash bin/baudbot.test.shnpm run lint:shellnpm run test:shellnpm testReal droplet validation (Arch Linux)
bin/ci/droplet.shsudo baudbot state backup /tmp/baudbot-state.zipon old dropletsudo baudbot state restore /tmp/baudbot-state.zip600)