@shubpal07
Contributor

Summary of Changes

The gcluster destroy command has been enhanced with a --robust flag to improve the reliability of resource cleanup, particularly for firewall rules.

The --robust flag introduces these key features:

  1. Targeted Firewall Cleanup: Firewall rule cleanup is performed only for deployment groups that contain a network module (specifically, a module with source: "modules/network/vpc"). The cleanup for a group happens just before that specific group's resources are destroyed. This ensures firewalls are removed at the most appropriate time to avoid "resource in use" errors. The cleanup now uses the Google Cloud Go SDK instead of external gcloud calls.
  2. Automatic Retry Mechanism: The entire destroy process for all selected groups is retried if any failure occurs.
    • Retry Count: The command will make up to 3 attempts to destroy the deployment successfully.
    • Purpose: This helps overcome transient issues or dependencies between resources that might not be immediately resolved in the cloud backend.

User Interaction

  • Firewall Deletion Confirmation: When --robust is used and firewall rules are found for a group, the user is prompted to confirm their deletion [y/n], unless --auto-approve is also specified.
  • Automated Mode: The --auto-approve flag suppresses all prompts, including firewall deletion confirmations, for automated workflows.

How to Use

Standard Robust Destroy (Interactive)

This will prompt you for confirmation before deleting firewall rules.

./gcluster destroy <DEPLOYMENT_DIRECTORY> --robust

Fully Automated Robust Destroy

This will automatically approve and delete all resources, including firewall rules, without any user prompts.

./gcluster destroy <DEPLOYMENT_DIRECTORY> --robust --auto-approve

Standard Destroy

The original destroy functionality remains unchanged. The robust logic is only triggered when the --robust flag is present.

# Standard interactive destroy
./gcluster destroy <DEPLOYMENT_DIRECTORY>

# Standard automated destroy
./gcluster destroy <DEPLOYMENT_DIRECTORY> --auto-approve

How to Test

  1. Deploy a Cluster: First, create a sample deployment if you don't have one already.

    ./gcluster deploy examples/hpc-slurm.yaml
  2. Build the Tool: Ensure you have the latest version of the gcluster binary with these changes.

    make gcluster
  3. Run Robust Destroy: Execute the destroy command with the --robust flag on your deployment directory.

    ./gcluster destroy <YOUR_DEPLOYMENT_DIRECTORY> --robust
  4. Verify the Prompt: The tool should list the firewall rules it intends to delete and wait for your input.

    • Enter y to confirm and watch the process continue.
    • (Optional) Re-deploy and run again, this time entering n to verify that the firewall cleanup is skipped.
  5. Verify Automated Destroy: (Optional) If you want to test the non-interactive mode, re-deploy the cluster and run:

    ./gcluster destroy <YOUR_DEPLOYMENT_DIRECTORY> --robust --auto-approve

    The command should complete the entire destruction process without any prompts.

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

@shubpal07 shubpal07 self-assigned this Nov 21, 2025
@shubpal07 shubpal07 added the release-improvements Added to release notes under the "Improvements" heading. label Nov 21, 2025
@shubpal07
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a --robust flag to the gcluster destroy command, adding functionality for targeted firewall cleanup and an automatic retry mechanism. The changes are well-structured, primarily within cmd/destroy.go. My review focuses on enhancing the robustness of the new error handling and resource cleanup logic to ensure the retry mechanism behaves as expected under failure conditions.

@shubpal07 shubpal07 marked this pull request as ready for review December 3, 2025 06:03
@shubpal07 shubpal07 requested review from a team and samskillman as code owners December 3, 2025 06:03
@shubpal07 shubpal07 requested review from bytetwin and cboneti December 3, 2025 06:03
@shubpal07 shubpal07 force-pushed the shubham/robust-destroy branch from 9758ff6 to df4c3fc Compare December 3, 2025 06:11
@shubpal07 shubpal07 marked this pull request as draft December 4, 2025 08:51
@shubpal07 shubpal07 marked this pull request as ready for review December 9, 2025 04:40
@shubpal07 shubpal07 force-pushed the shubham/robust-destroy branch from 9f175f3 to 7450b9a Compare December 9, 2025 05:29
@sarthakag
Contributor

Can we make this the default behaviour for gcluster destroy command, instead of under the specific --robust flag? This would fix the incomplete destroy behaviour.

@cboneti
Member

cboneti commented Dec 13, 2025

> Can we make this the default behaviour for gcluster destroy command, instead of under the specific --robust flag? This would fix the incomplete destroy behaviour.

I would prefer that we don't change the default yet. Let's test this extensively internally. Most external customers don't need this feature.

Member

@cboneti cboneti left a comment

Good new feature. Thanks!!

@vikramvs-gg vikramvs-gg merged commit 6791739 into GoogleCloudPlatform:develop Dec 23, 2025
19 of 72 checks passed