CASSANDRA-21129: Offline TCM dump tool #4581
Open
dracarys09 wants to merge 4 commits into apache:trunk
When a Cassandra node fails to start due to Transactional Cluster Metadata (TCM/CEP-21) corruption or issues, operators need a way to inspect the cluster metadata state offline without starting the node. The existing tools (nodetool, cqlsh) require a running node, leaving operators blind when debugging startup failures.

With CEP-21 (Transactional Cluster Metadata), cluster metadata is stored in system tables:

* system.local_metadata_log - Contains transformation entries (epoch -> transformation)
* system.metadata_snapshots - Contains periodic snapshots of ClusterMetadata

When a node fails to start due to TCM corruption or inconsistencies, operators have no way to inspect the metadata state without a running node. This tool fills that gap by reading directly from SSTables.
krummas (Member) requested changes on Jan 26, 2026 and left a comment:
So this is an emergency recovery tool, hopefully used extremely rarely by an operator. I think we can slim it down a lot; these are the features I think we need here:
- dump metadata to current (or user provided) epoch
  - serialized binary format
  - metadata.toString, to avoid locking us in to any formats
- dump log (with start/end epoch), just toString each entry
- maybe add option to dump system_cluster_metadata.distributed_metadata_log if this is run on a CMS node
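
The "dump log (with start/end epoch), just toString each entry" feature could be sketched roughly as below. This is only an illustration under stated assumptions: `LogRangeDumpSketch` and `dumpRange` are hypothetical names, and a plain `NavigableMap` keyed by epoch stands in for the entries read from the metadata log table. It does not use any Cassandra API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch, not code from the PR: the map stands in for log entries read from SSTables.
final class LogRangeDumpSketch
{
    // Returns one line per entry with startEpoch <= epoch <= endEpoch, using only toString() on each entry.
    static <T> List<String> dumpRange(NavigableMap<Long, T> log, long startEpoch, long endEpoch)
    {
        if (startEpoch > endEpoch)
            throw new IllegalArgumentException("start epoch " + startEpoch + " is after end epoch " + endEpoch);

        List<String> lines = new ArrayList<>();
        // subMap with both bounds inclusive selects the requested epoch window in ascending order
        for (Map.Entry<Long, T> entry : log.subMap(startEpoch, true, endEpoch, true).entrySet())
            lines.add("epoch=" + entry.getKey() + " " + entry.getValue());
        return lines;
    }

    public static void main(String[] args)
    {
        // Placeholder entries; a real log would hold transformation objects
        NavigableMap<Long, String> log = new TreeMap<>();
        log.put(1L, "entry-a");
        log.put(2L, "entry-b");
        log.put(3L, "entry-c");
        log.put(4L, "entry-d");

        dumpRange(log, 2, 3).forEach(System.out::println);
    }
}
```

Printing via toString only, as the comment suggests, keeps the tool from committing to an output format that later releases would have to preserve.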
Issues:
- shell script should live in tools/bin/ directory
- tool name - this does not dump sstable metadata, it dumps cluster metadata from sstables; sstable metadata is something different (see tools/bin/sstablemetadata)
- it copies the sstables to $CASSANDRA_HOME/data (or, if that is unset, into the current directory) - we should create a temporary directory for the import and clean that directory up after dumping the metadata. We need something like
Path p = Files.createTempDirectory("dumptcmlog");
DatabaseDescriptor.getRawConfig().data_file_directories = new String[] {p.resolve("data").toString()};
DatabaseDescriptor.getRawConfig().commitlog_directory = p.resolve("commitlog").toString();
DatabaseDescriptor.getRawConfig().accord.journal_directory = p.resolve("accord_journal").toString();
DatabaseDescriptor.getRawConfig().hints_directory = p.resolve("hints").toString();
DatabaseDescriptor.getRawConfig().saved_caches_directory = p.resolve("saved_caches").toString();
to make sure we only touch the tmp directory
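
The "create a temporary directory and clean it up afterwards" part of this suggestion might look something like the following sketch. It uses only `java.nio.file` so that it runs on its own; `TempDirDumpSketch`, `DumpAction` and `runInTempDirectory` are hypothetical names, and the `DatabaseDescriptor` overrides from the comment above are only indicated by a code comment, since they need a Cassandra classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical sketch, not code from the PR: shows create / use / always-delete for the temp directory.
final class TempDirDumpSketch
{
    // Stand-in for "import the sstables and dump the metadata"
    interface DumpAction
    {
        void run(Path root) throws IOException;
    }

    static Path runInTempDirectory(DumpAction action) throws IOException
    {
        Path root = Files.createTempDirectory("dumptcmlog");
        try
        {
            // Same subdirectory names as in the review comment
            for (String name : new String[]{ "data", "commitlog", "accord_journal", "hints", "saved_caches" })
                Files.createDirectories(root.resolve(name));
            // The real tool would point the DatabaseDescriptor raw config at these paths here
            action.run(root);
        }
        finally
        {
            // Runs even if the dump fails, so nothing is left behind
            deleteRecursively(root);
        }
        return root;
    }

    static void deleteRecursively(Path root) throws IOException
    {
        if (!Files.exists(root))
            return;
        try (Stream<Path> paths = Files.walk(root))
        {
            // Reverse order visits children before their parent directories
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try
                {
                    Files.delete(p);
                }
                catch (IOException e)
                {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException
    {
        Path used = runInTempDirectory(root -> Files.writeString(root.resolve("data").resolve("dump.txt"), "placeholder"));
        System.out.println("temp directory removed: " + !Files.exists(used));
    }
}
```

Doing the cleanup in a `finally` block means a failed dump still leaves $CASSANDRA_HOME/data and the current directory untouched.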