STAC-24549: Handle PARTIAL snapshots in ES restore and add describe command #25
Merged
…ommand
- Fix JSON unmarshalling of snapshot failures (was []string, now []SnapshotFailure)
- Warn users before restoring PARTIAL snapshots with explicit confirmation
- Add --allow-partial flag for non-interactive PARTIAL restore (required with --yes)
- Pass partial=true to ES restore API for PARTIAL snapshots to avoid "wasn't fully snapshotted" errors
- Add elasticsearch describe command to show snapshot details as pretty JSON
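To illustrate the unmarshalling fix: each entry in a snapshot's "failures" array is a JSON object, so decoding it into []string fails. The sketch below is not the code from this PR. The JSON keys are taken from the describe output further down, while the Go type and function names (Snapshot, ShardStats, legacySnapshot, parseSnapshot, decodeLegacy) and the shortened sample payload are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SnapshotFailure mirrors one object in the "failures" array. The JSON keys
// come from the describe output in this PR; the Go field names are assumed.
type SnapshotFailure struct {
	Index     string `json:"index"`
	IndexUUID string `json:"index_uuid"`
	ShardID   int    `json:"shard_id"`
	Reason    string `json:"reason"`
	NodeID    string `json:"node_id"`
	Status    string `json:"status"`
}

// ShardStats mirrors the "shards" object.
type ShardStats struct {
	Total      int `json:"total"`
	Failed     int `json:"failed"`
	Successful int `json:"successful"`
}

// Snapshot holds only the fields this sketch needs (hypothetical type).
type Snapshot struct {
	Snapshot string            `json:"snapshot"`
	State    string            `json:"state"`
	Failures []SnapshotFailure `json:"failures"`
	Shards   ShardStats        `json:"shards"`
}

// legacySnapshot reproduces the old shape, with failures as plain strings.
type legacySnapshot struct {
	Failures []string `json:"failures"`
}

// parseSnapshot decodes a single snapshot object.
func parseSnapshot(data []byte) (Snapshot, error) {
	var s Snapshot
	if err := json.Unmarshal(data, &s); err != nil {
		return Snapshot{}, fmt.Errorf("decode snapshot: %w", err)
	}
	return s, nil
}

// decodeLegacy tries the old []string shape and returns the decode error.
func decodeLegacy(data []byte) error {
	var old legacySnapshot
	return json.Unmarshal(data, &old)
}

// sample is a trimmed payload: one failure entry and a shortened reason.
const sample = `{
  "snapshot": "sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq",
  "state": "PARTIAL",
  "failures": [
    {"index": ".ds-sts_k8s_logs-2026.03.30-008069", "shard_id": 2,
     "reason": "Connection refused", "status": "INTERNAL_SERVER_ERROR"}
  ],
  "shards": {"total": 267, "failed": 51, "successful": 216}
}`

func main() {
	fmt.Println("legacy decode fails:", decodeLegacy([]byte(sample)) != nil)

	s, err := parseSnapshot([]byte(sample))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%s: %d of %d shard(s) failed\n", s.State, s.Shards.Failed, s.Shards.Total)
}
```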
❯ go run main.go elasticsearch restore --namespace nightly-restore --latest
Setting up port-forward to suse-observability-elasticsearch-master-headless:9200 in namespace nightly-restore...
✅ Port-forward established on localhost:54995
Fetching latest snapshot from repository 'sts-backup'...
✅ Latest snapshot found: sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq
⚠️ Warning: WARNING: Snapshot 'sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq' is in PARTIAL state!
⚠️ Warning: 51 shard(s) failed out of 267 total (216 successful)
⚠️ Warning: Restoring this snapshot will result in incomplete data for the failed shards.
Do you want to continue? (yes/no): yes
⚠️ Warning: WARNING: Restoring from snapshot will DELETE all existing STS indices!
⚠️ Warning: This operation cannot be undone.
Snapshot to restore: sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq
Snapshot state: PARTIAL
Namespace: nightly-restore
Do you want to continue? (yes/no): yes
Scaling down deployments (selector: observability.suse.com/scalable-during-es-restore=true)...
✅ Scaled down 3 deployment(s):
- suse-observability-e2es (replicas: 0 -> 0)
- suse-observability-receiver-base (replicas: 0 -> 0)
- suse-observability-receiver-logs (replicas: 0 -> 0)
✅ Scaled down 0 statefulsets(s):
Waiting for pods to terminate...
✅ All pods have terminated
Fetching current Elasticsearch indices...
Found 90 STS index(es) to delete
Rolling over datastream 'sts_k8s_logs'...
✅ Datastream rolled over successfully
Deleting 90 index(es)...
Deleting index: .ds-sts_k8s_logs-2026.03.27-008019
Deleting index: .ds-sts_k8s_logs-2026.03.30-008073
Deleting index: .ds-sts_k8s_logs-2026.03.30-008075
Deleting index: sts_topology_events-2026.04.01
Deleting index: .ds-sts_k8s_logs-2026.03.30-008071
Deleting index: .ds-sts_k8s_logs-2026.03.29-008063
Deleting index: .ds-sts_k8s_logs-2026.03.29-008065
Deleting index: .ds-sts_k8s_logs-2026.03.31-008109
Deleting index: .ds-sts_k8s_logs-2026.03.29-008061
Deleting index: .ds-sts_k8s_logs-2026.03.31-008107
Deleting index: .ds-sts_k8s_logs-2026.03.28-008025
Deleting index: .ds-sts_k8s_logs-2026.03.31-008105
Deleting index: .ds-sts_k8s_logs-2026.03.27-008021
Deleting index: .ds-sts_k8s_logs-2026.03.30-008069
Deleting index: .ds-sts_k8s_logs-2026.03.28-008023
Deleting index: .ds-sts_k8s_logs-2026.03.31-008103
Deleting index: sts_topology_events-2026.03.31
Deleting index: .ds-sts_k8s_logs-2026.03.28-008029
Deleting index: .ds-sts_k8s_logs-2026.03.31-008101
Deleting index: sts_topology_events-2026.03.30
Deleting index: .ds-sts_k8s_logs-2026.03.30-008067
Deleting index: .ds-sts_k8s_logs-2026.03.28-008027
Deleting index: sts_topology_events-2026.03.03
Deleting index: .ds-sts_k8s_logs-2026.03.30-008083
Deleting index: .ds-sts_k8s_logs-2026.03.26-007997
Deleting index: .ds-sts_k8s_logs-2026.03.30-008085
Deleting index: sts_topology_events-2026.03.06
Deleting index: .ds-sts_k8s_logs-2026.03.26-007999
Deleting index: sts_topology_events-2026.03.07
Deleting index: sts_topology_events-2026.03.04
Deleting index: sts_topology_events-2026.03.05
Deleting index: .ds-sts_k8s_logs-2026.03.30-008081
Deleting index: .ds-sts_k8s_logs-2026.04.01-008111
Deleting index: sts_topology_events-2026.03.08
Deleting index: sts_topology_events-2026.03.09
Deleting index: .ds-sts_k8s_logs-2026.03.30-008077
Deleting index: .ds-sts_k8s_logs-2026.03.30-008079
Deleting index: .ds-sts_k8s_logs-2026.03.31-008099
Deleting index: sts_topology_events-2026.03.13
Deleting index: sts_topology_events-2026.03.14
Deleting index: .ds-sts_k8s_logs-2026.03.31-008097
Deleting index: sts_topology_events-2026.03.11
Deleting index: sts_topology_events-2026.03.12
Deleting index: .ds-sts_k8s_logs-2026.03.31-008095
Deleting index: sts_topology_events-2026.03.17
Deleting index: sts_topology_events-2026.03.18
Deleting index: .ds-sts_k8s_logs-2026.03.31-008093
Deleting index: sts_topology_events-2026.03.15
Deleting index: sts_topology_events-2026.03.16
Deleting index: .ds-sts_k8s_logs-2026.03.31-008091
Deleting index: sts_topology_events-2026.03.19
Deleting index: .ds-sts_k8s_logs-2026.03.28-008043
Deleting index: .ds-sts_k8s_logs-2026.04.02-008114
Deleting index: .ds-sts_k8s_logs-2026.03.28-008041
Deleting index: .ds-sts_k8s_logs-2026.04.01-008113
Deleting index: .ds-sts_k8s_logs-2026.03.29-008049
Deleting index: .ds-sts_k8s_logs-2026.03.27-008001
Deleting index: .ds-sts_k8s_logs-2026.03.29-008045
Deleting index: .ds-sts_k8s_logs-2026.03.27-008003
Deleting index: sts_topology_events-2026.03.10
Deleting index: .ds-sts_k8s_logs-2026.03.30-008087
Deleting index: .ds-sts_k8s_logs-2026.03.31-008089
Deleting index: .ds-sts_k8s_logs-2026.03.29-008047
Deleting index: .ds-sts_k8s_logs-2026.03.27-008005
Deleting index: sts_topology_events-2026.03.24
Deleting index: .ds-sts_k8s_logs-2026.03.27-008007
Deleting index: sts_topology_events-2026.03.25
Deleting index: sts_topology_events-2026.03.22
Deleting index: sts_topology_events-2026.03.23
Deleting index: .ds-sts_k8s_logs-2026.03.27-008009
Deleting index: sts_topology_events-2026.03.28
Deleting index: sts_topology_events-2026.03.29
Deleting index: sts_topology_events-2026.03.26
Deleting index: sts_topology_events-2026.03.27
Deleting index: .ds-sts_k8s_logs-2026.03.29-008053
Deleting index: .ds-sts_k8s_logs-2026.03.29-008055
Deleting index: .ds-sts_k8s_logs-2026.03.28-008031
Deleting index: .ds-sts_k8s_logs-2026.03.29-008051
Deleting index: .ds-sts_k8s_logs-2026.03.27-008011
Deleting index: .ds-sts_k8s_logs-2026.03.28-008035
Deleting index: .ds-sts_k8s_logs-2026.03.28-008033
Deleting index: .ds-sts_k8s_logs-2026.03.27-008013
Deleting index: .ds-sts_k8s_logs-2026.03.27-008015
Deleting index: .ds-sts_k8s_logs-2026.03.28-008039
Deleting index: sts_topology_events-2026.03.20
Deleting index: sts_topology_events-2026.03.21
Deleting index: .ds-sts_k8s_logs-2026.03.29-008057
Deleting index: .ds-sts_k8s_logs-2026.03.28-008037
Deleting index: .ds-sts_k8s_logs-2026.03.27-008017
Deleting index: .ds-sts_k8s_logs-2026.03.29-008059
✅ All indices deleted successfully
Triggering restore for snapshot: sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq
✅ Restore triggered successfully
Checking restore status for snapshot: sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq
✅ Restore completed successfully
Finalizing restore...
Scaling up deployments from annotations (selector: observability.suse.com/scalable-during-es-restore=true)...
✅ Scaled up 3 deployment(s) successfully:
- suse-observability-e2es (replicas: 0 -> 0)
- suse-observability-receiver-base (replicas: 0 -> 0)
- suse-observability-receiver-logs (replicas: 0 -> 0)
✅ Scaled up 0 statefulset(s) successfully:
✅ Finalization completed successfully
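The "Triggering restore" step in the run above is where partial=true is passed for a PARTIAL snapshot. A minimal sketch of how such a request could be assembled follows, assuming the standard Elasticsearch restore endpoint and its boolean "partial" body parameter. The helper names (restoreRequest, restorePath, restoreBody) are hypothetical, and the tool's real request body may carry additional fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// restoreRequest is a hypothetical request body. Only "partial" is taken from
// the PR description; it is omitted entirely when false.
type restoreRequest struct {
	Partial bool `json:"partial,omitempty"`
}

// restorePath builds the path of the Elasticsearch restore endpoint.
func restorePath(repository, snapshot string) string {
	return fmt.Sprintf("/_snapshot/%s/%s/_restore",
		url.PathEscape(repository), url.PathEscape(snapshot))
}

// restoreBody sets partial=true only when the snapshot state is PARTIAL.
func restoreBody(state string) ([]byte, error) {
	return json.Marshal(restoreRequest{Partial: state == "PARTIAL"})
}

func main() {
	body, err := restoreBody("PARTIAL")
	if err != nil {
		fmt.Println(err)
		return
	}
	path := restorePath("sts-backup", "sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq")
	fmt.Println("POST", path)
	fmt.Println(string(body))
}
```

Leaving the field out for fully successful snapshots keeps the request identical to what was sent before this change.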
❯ go run main.go elasticsearch restore --namespace nightly-restore --latest --yes
Setting up port-forward to suse-observability-elasticsearch-master-headless:9200 in namespace nightly-restore...
✅ Port-forward established on localhost:55124
Fetching latest snapshot from repository 'sts-backup'...
✅ Latest snapshot found: sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq
❌ Error: snapshot 'sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq' is PARTIAL with 51 shard failure(s); use --allow-partial together with --yes to restore a partial snapshot non-interactively
exit status 1
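The two runs so far show the interactive prompt and the refusal under --yes. The decision could be sketched roughly as below. This is not the PR's implementation: the type and function names are hypothetical, and the behaviour of --allow-partial without --yes (still prompting) is an assumption, since the conversation does not show that case. The error text follows the message printed above.

```go
package main

import "fmt"

// restoreFlags holds the two CLI flags that matter here (hypothetical type).
type restoreFlags struct {
	Yes          bool // --yes: skip interactive confirmation
	AllowPartial bool // --allow-partial: accept PARTIAL snapshots with --yes
}

// partialDecision says what the caller should do next.
type partialDecision int

const (
	proceed    partialDecision = iota // continue without asking
	promptUser                        // ask for explicit confirmation
)

// decidePartial applies the rule described in the PR: a PARTIAL snapshot
// needs confirmation, and with --yes it also needs --allow-partial.
// Assumption: without --yes the prompt is shown even if --allow-partial is set.
func decidePartial(name, state string, failedShards int, f restoreFlags) (partialDecision, error) {
	if state != "PARTIAL" {
		return proceed, nil
	}
	if !f.Yes {
		return promptUser, nil
	}
	if !f.AllowPartial {
		return proceed, fmt.Errorf("snapshot '%s' is PARTIAL with %d shard failure(s); "+
			"use --allow-partial together with --yes to restore a partial snapshot non-interactively",
			name, failedShards)
	}
	return proceed, nil
}

func main() {
	_, err := decidePartial("sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq",
		"PARTIAL", 51, restoreFlags{Yes: true})
	fmt.Println("error:", err)
}
```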
❯ go run main.go elasticsearch describe --namespace nightly-restore -s sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq
Setting up port-forward to suse-observability-elasticsearch-master-headless:9200 in namespace nightly-restore...
✅ Port-forward established on localhost:55283
Fetching snapshot 'sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq' from repository 'sts-backup'...
{
"snapshot": "sts-backup-20260401-0300-ggheroepr2aj24ebp15qpq",
"uuid": "CE8LQ3_HTo6dmi0iiZC6LQ",
"repository": "sts-backup",
"state": "PARTIAL",
"start_time": "2026-04-01T02:59:59.864Z",
"start_time_in_millis": 1775012399864,
"end_time": "2026-04-01T03:04:48.697Z",
"end_time_in_millis": 1775012688697,
"duration_in_millis": 288833,
"indices": [
".ds-sts_k8s_logs-2026.03.28-008041",
".ds-sts_k8s_logs-2026.03.28-008025",
".ds-sts_k8s_logs-2026.03.31-008097",
...
],
"failures": [
{
"index": ".ds-sts_k8s_logs-2026.03.30-008069",
"index_uuid": "u82xI1lgToOVOL-UGO00RQ",
"shard_id": 2,
"reason": "IOException[Unable to upload object [nightly/elasticsearch/indices/uVVwxR7ySKinTlIgnz8axg/2/__-OY_MWpfR2OH1XM8fLLa1w] using a single upload]; nested: SdkClientException[Unable to execute HTTP request: Connect to suse-observability-s3proxy:9000 [suse-observability-s3proxy/10.0.240.76] failed: Connection refused (SDK Attempt Count: 1)]; nested: HttpHostConnectException[Connect to suse-observability-s3proxy:9000 [suse-observability-s3proxy/10.0.240.76] failed: Connection refused]; nested: ConnectException[Connection refused]",
"node_id": "J4QengskRpGJfaB8uwWymw",
"status": "INTERNAL_SERVER_ERROR"
},
{
"index": ".ds-sts_k8s_logs-2026.03.31-008089",
"index_uuid": "sn2Wfve3T3yNLQZFWWbRPA",
"shard_id": 2,
"reason": "IOException[Unable to upload object [nightly/elasticsearch/indices/x_sMK1BmR9i4T6aE2WsdUg/2/__e3jCZU4-StKTu_N-mGev3A] using a single upload]; nested: SdkClientException[Unable to execute HTTP request: Connect to suse-observability-s3proxy:9000 [suse-observability-s3proxy/10.0.240.76] failed: Connection refused (SDK Attempt Count: 1)]; nested: HttpHostConnectException[Connect to suse-observability-s3proxy:9000 [suse-observability-s3proxy/10.0.240.76] failed: Connection refused]; nested: ConnectException[Connection refused]",
"node_id": "J4QengskRpGJfaB8uwWymw",
"status": "INTERNAL_SERVER_ERROR"
},
...
],
"shards": {
"total": 267,
"failed": 51,
"successful": 216
}
}
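Output like the above could be produced by re-indenting the raw snapshot object instead of round-tripping it through a struct, which keeps every field Elasticsearch returns. The sketch below assumes the get-snapshot response wraps results in a "snapshots" array. The names getSnapshotResponse and prettySnapshot and the two-space indent are assumptions, not taken from the PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// getSnapshotResponse keeps each snapshot as raw JSON so no field is dropped.
// Assumption: the response wraps results in a "snapshots" array.
type getSnapshotResponse struct {
	Snapshots []json.RawMessage `json:"snapshots"`
}

// prettySnapshot returns the first snapshot in the response, re-indented.
func prettySnapshot(resp []byte) (string, error) {
	var r getSnapshotResponse
	if err := json.Unmarshal(resp, &r); err != nil {
		return "", fmt.Errorf("decode response: %w", err)
	}
	if len(r.Snapshots) == 0 {
		return "", errors.New("no snapshot in response")
	}
	var buf bytes.Buffer
	if err := json.Indent(&buf, r.Snapshots[0], "", "  "); err != nil {
		return "", fmt.Errorf("indent snapshot: %w", err)
	}
	return buf.String(), nil
}

func main() {
	// Made-up, heavily trimmed response used only to exercise the helper.
	resp := []byte(`{"snapshots":[{"state":"PARTIAL","shards":{"total":267,"failed":51,"successful":216}}]}`)
	out, err := prettySnapshot(resp)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(out)
}
```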
rb3ckers
approved these changes
Apr 3, 2026
Summary